WO2022057671A1 - Neural network–based knowledge graph inconsistency reasoning method - Google Patents

Neural network–based knowledge graph inconsistency reasoning method

Info

Publication number
WO2022057671A1
WO2022057671A1, PCT/CN2021/116777, CN2021116777W
Authority
WO
WIPO (PCT)
Prior art keywords
axiom
triplet
neural network
representation
inconsistency
Prior art date
Application number
PCT/CN2021/116777
Other languages
French (fr)
Chinese (zh)
Inventor
陈华钧
李娟
张文
Original Assignee
浙江大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江大学 filed Critical 浙江大学
Publication of WO2022057671A1 publication Critical patent/WO2022057671A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00: Computing arrangements using knowledge-based models
    • G06N 5/04: Inference or reasoning models

Definitions

  • The invention belongs to the field of knowledge graphs and neural networks, and in particular relates to a neural-network-based knowledge graph inconsistency reasoning method.
  • A knowledge graph is a structured knowledge system and has been widely applied in knowledge-driven tasks such as search engines, recommendation systems, and question answering systems.
  • Large-scale knowledge graphs for open and vertical domains have been constructed by manual annotation and by semi-automatic or automatic methods.
  • Classical knowledge graphs such as Wikidata, Freebase, and DBpedia store the relationships between entities as triples. Each triple corresponds to a piece of knowledge; for example, (China, capital is, Beijing) means that the capital of China is Beijing, where "China" is called the head entity, "Beijing" the tail entity, and "capital is" the relation.
  • Because knowledge graphs are usually constructed automatically or semi-automatically, the resulting graphs may suffer from quality problems such as incomplete knowledge and incorrect knowledge.
  • For incomplete knowledge, many knowledge graph representation learning algorithms have been proposed to predict new knowledge that completes an existing graph, mainly by predicting a missing entity given an entity and a relation, or by predicting the possible relation between two given entities.
  • For incorrect knowledge, which covers both knowledge that violates the axioms and knowledge that satisfies the axioms but whose content is wrong, erroneous knowledge is detected mainly through knowledge inconsistency reasoning so that the errors can be corrected.
  • Knowledge graph inconsistency reasoning aims to detect wrong knowledge. It contains two kinds of tasks: detecting whether a triple is consistent, and detecting whether a triple is consistent with a particular axiom; both are essentially binary classification over a triple.
  • Existing work on inconsistency reasoning over triples in a knowledge graph relies on ontology information from the knowledge base or on hand-crafted templates.
  • Existing knowledge bases usually do not have a well-defined ontology, and manual definition by experts is not only costly in time but also tends to yield an incomplete ontology.
  • Current knowledge representation learning models, including distance models, semantic models, and neural network models, embed a knowledge graph into a low-dimensional vector space and represent the entities and relations of triples as low-dimensional vectors.
  • These methods obtain a score for a triple through vector computation to determine whether a piece of knowledge is correct, which corresponds to the triple classification task.
  • These methods demonstrate the advantages of knowledge representation learning algorithms and neural networks in knowledge graph reasoning tasks.
  • However, these methods can only judge the consistency of a triple as a whole and cannot judge, at a fine-grained level, whether the axioms associated with the triple are consistent; a method that detects whether a triple is consistent with a particular axiom is therefore urgently needed to detect wrong knowledge and enable error correction.
  • The purpose of the present invention is to provide a neural-network-based knowledge graph inconsistency reasoning method that does not require given ontology information; it uses a neural network to learn inconsistency axioms and judges, through a knowledge representation learning algorithm and the neural network, whether a triple is inconsistent and whether it is inconsistent on a given axiom.
  • A neural-network-based knowledge graph inconsistency reasoning method comprises the following steps:
  • A knowledge representation learning algorithm learns the entity representations and relation representations of the triples in the knowledge graph and computes a representation score for each triple.
  • The entity and relation representations are used as the input of a neural network; the triples are used to model each axiom through the neural network, learning the network parameters that represent the corresponding axiom and thereby obtaining an axiom model. The axiom model produces an axiom prediction value for a triple, and the representation score and axiom prediction values of the triple are used to judge whether the triple, and each corresponding axiom, is inconsistent.
  • The knowledge graph is embedded into a low-dimensional vector space, entities and relations are represented as vectors, and the structural information of the triples is preserved through the knowledge representation learning algorithm.
  • Given a triple (s, r, o), any knowledge representation learning algorithm that preserves structural information can be chosen, such as TransE or DistMult.
  • The input is the vector representations of the entities and the relation, and the output is the representation score of the current triple.
  • The knowledge representation learning algorithm and the neural network are trained simultaneously, so that the learned vector representations and network parameters both encode the structural information of the triples and make the triples conform to the relevant axioms.
  • The representation learning algorithm can be any knowledge graph representation learning algorithm, and the neural network performs binary classification to output, for each axiom, the probability that consistency is satisfied.
  • During training, the concrete input of the neural network comes from the elements of the current triple or from the elements of related triples found through it.
  • First, the elements from the triple or its related triples that each axiom needs to consider are analyzed; then, according to the assumption of each axiom, the existing triples are taken to satisfy the axiom constraints, so the elements of the existing triples and their related triples serve as the positive samples of the corresponding neural network.
  • The negative samples of each neural network module are constructed under the closed-world assumption. This setting allows the model to learn the axioms using only the triples that exist in the knowledge graph, without ontology information.
  • A neural network model is built for each axiom; the positive samples of each model are constructed according to the constraints or conditions under which the inconsistency corresponding to the axiom occurs, and the negative samples are constructed from the positive samples.
  • For the triple elements that each axiom's neural network model needs to consider, the corresponding entity and relation representations are concatenated and fed into the neural network model, and the prediction value corresponding to each axiom is computed.
  • The prediction values of all axioms give the prediction score of the neural network models, and this prediction score is combined with the representation score computed by the knowledge representation learning algorithm to obtain the total score of the triple.
  • A margin loss function is constructed from the total score of each positive-sample triple and the total score of the corresponding negative-sample triple, and the loss is used to jointly update and optimize the parameters of the knowledge representation learning algorithm and the neural network models, as well as the entity and relation representations.
  • After optimization, the neural network models with the learned parameters are the axiom models, and the entity and relation representations of the knowledge graph are determined.
  • When selecting axioms for inconsistency detection, it is first determined, from the conditions or constraints that the axiom definition requires to be satisfied, whether an axiom can be used for inconsistency detection; axioms are then selected from those usable for inconsistency detection, and the conditions or constraints of the selected axioms are mapped onto the relevant elements of the triples in the knowledge graph and the associations between triple elements, which yields the positive samples of each axiom.
  • When judging triple inconsistency, a relation threshold is set for each relation, and the entity and relation representations determined at the end of optimization are used.
  • The knowledge representation learning algorithm and the axiom models compute the total score of the triple; when the total score of the triple is below the relation threshold, the triple is considered a correct triple, otherwise it is a wrong triple, that is, an inconsistency exists.
  • When judging whether a triple is inconsistent on a corresponding axiom, an axiom threshold is set for each axiom, and the entity and relation vector representations determined when the knowledge representation learning model is generated are used.
  • Each axiom model computes the triple's prediction value for its axiom; when the prediction value of the triple for the axiom is below the corresponding axiom threshold, the triple is considered inconsistent on that axiom.
  • For a triple (s, r, o), the selected axioms include:
  • Object Property Domain (the domain axiom), which states that the type of the head entity s of relation r should conform to the corresponding class;
  • Object Property Range (the range axiom), which states that the type of the tail entity o of relation r should conform to the corresponding class;
  • Disjoint Object Properties (the disjoint axiom), which states that relation r and relation r_1 are mutually exclusive, so the triples (s, r, o) and (s, r_1, o) should not both exist in one knowledge graph;
  • Irreflexive Object Property (the irreflexive axiom), under which an entity cannot point to itself through an irreflexive relation r, i.e., the case s = o is not allowed;
  • Asymmetric Object Property (the asymmetric axiom), under which the two triples (s, r, o) and (o, r, s) should not both exist in one knowledge graph when r is asymmetric.
  • The input of each neural network module is the elements of the current triple or of triples related to the current triple, and the output is the probability of conforming to the corresponding axiom.
  • The computations of the neural network modules are as follows:
  • domain axiom prediction value: P_dm(s, r, o) = g(W_1·[r; s] + b_1);
  • range axiom prediction value: P_rg(s, r, o) = g(W_2·[r; o] + b_2);
  • disjoint axiom prediction value: P_dis(s, r, o, s, r_1, o) = g(W_3·[r; r_1] + b_3);
  • irreflexive axiom prediction value: P_irre(s, r, o) = g(W_4·[s; r; o] + b_4);
  • asymmetric axiom prediction value: P_asym(s, r, o, s_1, r, o_1) = g(W_5·[s; r; o; s_1; r; o_1] + b_5);
  • total score of the triple: f(s, r, o) = f_r(s, o) + α(1 - P_dm) + β(1 - P_rg) + ε(1 - P_irre) + ζ(1 - P_dis) + η(1 - P_asym)
  • where s, s_1, r, r_1, o, o_1 denote the learned representation vectors of entity s, entity s_1, relation r, relation r_1, entity o, and entity o_1; the symbol ";" denotes the concatenation operation; g() denotes the sigmoid function; W_1, W_2, W_3, W_4, W_5 are weight vectors; b_1, b_2, b_3, b_4, b_5 are biases; α, β, ε, ζ, η are hyperparameters representing weights; and f_r(s, o) is the representation score of the triple obtained from the knowledge representation learning model.
  • For a correct triple, the axiom probability output by each axiom model is relatively higher than for a wrong triple, and the scoring function above ensures that the total score of a positive-sample triple is lower than that of a negative-sample triple.
  • The constructed margin loss function compares the total scores of positive and negative samples, where
  • F denotes the set of positive samples,
  • F' denotes the set of negative samples,
  • f(s, r, o) denotes the total score of the triple (s, r, o), and
  • f(s', r', o') denotes the total score of the triple (s', r', o').
  • Given a triple (s, r, o), the entity and relation vector representations and the neural network parameters learned in the training phase are used to compute the final score f(s, r, o) and the value of each axiom module, namely P_dm, P_rg, P_irre, P_dis, and P_asym.
  • For triple-level consistency, a threshold is introduced for each relation; when the score f(s, r, o) is below the threshold, the triple is a correct triple, otherwise it is a wrong triple, that is, an inconsistency exists.
  • A threshold is introduced for each axiom model according to the validation set; if P_dm, P_rg, P_irre, P_dis, or P_asym is below its corresponding axiom threshold, the triple is inconsistent on the domain, range, irreflexive, disjoint, or asymmetric axiom, respectively.
  • The beneficial effects of the present invention include at least the following:
  • The above knowledge graph inconsistency reasoning method does not require well-defined ontology information and uses only the triples already present in the knowledge graph for axiom learning, so that partial axioms can be captured and inconsistency reasoning performed even on knowledge graphs with no ontology definition or an incomplete one, greatly reducing labor cost. Moreover, the knowledge representation learning model and the axiom models can be applied to any knowledge graph.
  • The method converts axiom learning into neural network parameter learning, and joint training of the knowledge representation learning algorithm and the neural network lets the model both preserve structural information and learn the axioms related to inconsistency.
  • Axiom learning lets the model detect not only inconsistent triples but also, at a fine-grained level, inconsistency on each of the considered axioms; this enables better inconsistency reasoning and facilitates subsequent correction of the triples.
  • Figure 1 is a flowchart of the neural-network-based knowledge graph inconsistency reasoning method.
  • As shown in Figure 1, for a given knowledge graph containing a large number of triples (s, r, o), the knowledge graph inconsistency reasoning method includes the following steps:
  • Step 1: From the OWL2 object property axioms that can be used for inconsistency detection, select the following five axioms (domain, range, disjoint, irreflexive, and asymmetric) and analyze their OWL2 descriptions and judgment conditions.
  • Step 2: Obtain the training samples of the neural network corresponding to each axiom according to the judgment conditions, that is, determine the concrete input of each neural network from the judgment condition of its axiom.
  • A sample knowledge graph is used below to illustrate the input of each axiom model:
  • The given sample graph contains 6 triples, namely (s_1, r_1, o_1), (s_2, r_2, o_2), (s_1, r_3, o_1), (s_2, r_1, o_2), (s_1, r_1, s_1), and (o_1, r_3, s_1), with entities (s_1, s_2, o_1, o_2) and relations (r_1, r_2, r_3).
  • The domain axiom concerns the relation and the head entity; the module inputs are (r_1, s_1), (r_2, s_2), (r_3, s_1), (r_1, s_2), (r_3, o_1).
  • The range axiom concerns the relation and the tail entity; the module inputs are (r_1, o_1), (r_2, o_2), (r_3, o_1), (r_1, o_2), (r_1, s_1), (r_3, s_1).
  • The disjoint axiom concerns the correlation of the relations in pairs of triples whose head and tail entities are both the same; the module inputs are (s_1, r_1, o_1, s_1, r_3, o_1), (s_2, r_2, o_2, s_2, r_1, o_2).
  • The irreflexive axiom concerns the current triple; the module inputs are (s_1, r_1, o_1), (s_2, r_2, o_2), (s_1, r_3, o_1), (s_2, r_1, o_2), (s_1, r_1, s_1), (o_1, r_3, s_1). The asymmetric axiom concerns two triples with the same relation; the inputs are (s_1, r_1, o_1, s_2, r_1, o_2), (s_1, r_1, o_1, s_1, r_1, s_1), (s_2, r_1, o_2, s_1, r_1, s_1).
  • The knowledge representation learning algorithm concerns the current triple; the inputs are (s_1, r_1, o_1), (s_2, r_2, o_2), (s_1, r_3, o_1), (s_2, r_1, o_2), (s_1, r_1, s_1), (o_1, r_3, s_1).
  • The input of each module above can be regarded as a positive sample of that module.
  • Step 3: Jointly train the knowledge representation learning algorithm and the neural networks corresponding to the axioms.
  • The entity and relation representations of each sample are concatenated and fed into the neural network model, the prediction value corresponding to each axiom is computed, and the prediction score of the neural network models is obtained from the prediction values of all axioms.
  • The representation score of the triple and the prediction score of the axiom models are combined to obtain the total score of the triple; finally, a margin loss function is constructed from the total scores of positive-sample triples and the corresponding negative-sample triples and is used to jointly update and optimize the parameters of the knowledge representation learning algorithm and the neural network models.
  • After optimization, the knowledge representation learning algorithm with the determined entity and relation vector representations is the knowledge representation learning model, and the neural network models with the determined parameters are the axiom models.
  • Step 4: With the knowledge representation learning model and the axiom models trained in Step 3, inconsistency reasoning over the knowledge graph can be performed. Given a triple, its final score is computed; if the score is below the threshold, the triple is judged correct, and if it is above the threshold, the triple is considered inconsistent. For each axiom module, the probability of conforming to the axiom is computed; if the probability is above the corresponding threshold, the triple is consistent with that axiom, and otherwise it is inconsistent on that axiom.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Databases & Information Systems (AREA)
  • Animal Behavior & Ethology (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A neural-network-based knowledge graph inconsistency reasoning method comprises the following steps: performing representation learning on the triples with a knowledge representation learning algorithm to obtain entity representations and relation representations and to compute a representation score; using the entity and relation representations as the input of a neural network; modeling each axiom by means of the neural network, using the triples to learn the network parameters that represent the corresponding axiom and thereby obtaining an axiom model; obtaining an axiom prediction value for a triple from the axiom model; and judging, on the basis of the representation score and the axiom prediction value of the triple, whether the triple and the corresponding axiom are inconsistent. The method requires no given ontology information: an inconsistency axiom is learned by a neural network, and whether a triple is inconsistent, and whether it is inconsistent on the given axiom, is determined by means of the knowledge representation learning algorithm and the neural network.

Description

A Neural Network-Based Knowledge Graph Inconsistency Reasoning Method
Technical Field
The invention belongs to the field of knowledge graphs and neural networks, and in particular relates to a neural-network-based knowledge graph inconsistency reasoning method.
Background Art
A knowledge graph is a structured knowledge system and has been widely applied in knowledge-driven tasks such as search engines, recommendation systems, and question answering systems. To store and use knowledge efficiently, large-scale knowledge graphs for open and vertical domains have been constructed by manual annotation and by semi-automatic or automatic methods. Classical knowledge graphs such as Wikidata, Freebase, and DBpedia store the relationships between entities as triples. Each triple corresponds to a piece of knowledge; for example, (China, capital is, Beijing) means that the capital of China is Beijing, where "China" is called the head entity, "Beijing" the tail entity, and "capital is" the relation.
Because knowledge graphs are usually constructed automatically or semi-automatically, the resulting graphs may suffer from quality problems such as incomplete knowledge and incorrect knowledge. For incomplete knowledge, many knowledge graph representation learning algorithms have been proposed to predict new knowledge that completes an existing graph, mainly by predicting a missing entity given an entity and a relation, or by predicting the possible relation between two given entities. For incorrect knowledge, which covers both knowledge that violates the axioms and knowledge that satisfies the axioms but whose content is wrong, erroneous knowledge is detected mainly through knowledge inconsistency reasoning so that the errors can be corrected.
Knowledge graph inconsistency reasoning aims to detect wrong knowledge. It contains two kinds of tasks: detecting whether a triple is consistent, and detecting whether a triple is consistent with a particular axiom; both are essentially binary classification over a triple. First, existing work on inconsistency reasoning over triples in a knowledge graph relies on ontology information from the knowledge base or on hand-crafted templates. Existing knowledge bases usually do not have a well-defined ontology, and manual definition by experts is not only costly in time but also tends to yield an incomplete ontology. Second, current knowledge representation learning models, including distance models, semantic models, and neural network models, embed a knowledge graph into a low-dimensional vector space and represent the entities and relations of triples as low-dimensional vectors.
These methods obtain a score for a triple through vector computation and thereby judge whether a piece of knowledge is correct, which corresponds to the triple classification task. They also demonstrate the advantages of knowledge representation learning algorithms and neural networks in knowledge graph reasoning tasks. However, they can only judge the consistency of a triple as a whole and cannot judge, at a fine-grained level, whether the axioms associated with the triple are consistent. A method that detects whether a triple is consistent with a particular axiom is therefore urgently needed to detect wrong knowledge and enable error correction.
Summary of the Invention
The purpose of the present invention is to provide a neural-network-based knowledge graph inconsistency reasoning method that does not require given ontology information. The method uses a neural network to learn inconsistency axioms and judges, through a knowledge representation learning algorithm and the neural network, whether a triple is inconsistent and whether it is inconsistent on a given axiom.
To achieve the above purpose, the present invention provides the following technical solution:
A neural-network-based knowledge graph inconsistency reasoning method comprises the following steps:
Taking the entity representations and relation representations of the triples as input, a knowledge representation learning algorithm learns the entity and relation representations of the triples in the knowledge graph and computes a representation score for each triple. The entity and relation representations are then used as the input of a neural network, and the triples are used to model each axiom through the neural network, learning the network parameters that represent the corresponding axiom and thereby obtaining an axiom model. The axiom model produces an axiom prediction value for the triple, and the triple's representation score and axiom prediction values are used to judge whether the triple, and the corresponding axiom, is inconsistent.
When the knowledge representation learning algorithm performs triple representation learning, the knowledge graph is embedded into a low-dimensional vector space, entities and relations are represented as vectors, and the structural information of the triples is preserved. Given a triple (s, r, o), any knowledge representation learning algorithm that preserves structural information can be chosen, such as TransE or DistMult. The input is the vector representations of the entities and the relation, and the output is the representation score of the current triple. The score f_r is computed as follows: TransE: f_r(s, o) = ||s + r - o||; DistMult: f_r(s, o) = -s^T M_r o, where M_r is a diagonal matrix and s, r, o are the vector representations of s, r, and o, respectively. The value of f_r(s, o) for a correct triple is lower than that for a wrong triple.
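To make the scoring concrete, the following minimal sketch computes the TransE and DistMult representation scores f_r(s, o) from embedding vectors. The use of PyTorch, the function names, and the vector-of-diagonal encoding of M_r are illustrative assumptions, not details taken from the patent.

```python
import torch

def transe_score(s: torch.Tensor, r: torch.Tensor, o: torch.Tensor) -> torch.Tensor:
    # TransE: f_r(s, o) = ||s + r - o||; correct triples should score lower than wrong ones
    return torch.norm(s + r - o, p=2, dim=-1)

def distmult_score(s: torch.Tensor, m_r: torch.Tensor, o: torch.Tensor) -> torch.Tensor:
    # DistMult: f_r(s, o) = -s^T M_r o, with the diagonal of M_r stored as the vector m_r
    return -(s * m_r * o).sum(dim=-1)
```

Either score can serve as the structural term f_r(s, o) in the total score defined later.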
The axiom models are built through neural networks, and different parameters are learned for each network; each set of parameters is regarded as the features of the corresponding axiom. The knowledge representation learning algorithm and the neural networks are trained simultaneously, so that the learned vector representations and network parameters both encode the structural information of the triples and make the triples conform to the relevant axioms. The representation learning algorithm can be any knowledge graph representation learning algorithm, and each neural network performs binary classification to output the probability that its axiom is satisfied. During training, the concrete input of a neural network comes from the elements of the current triple or from the elements of related triples found through it. First, the elements of the triple or its related triples that each axiom needs to consider are analyzed; then, according to the assumption of each axiom, the existing triples are taken to satisfy the axiom constraints, so the elements of the existing triples and their related triples serve as positive samples of the corresponding neural network, and the negative samples of each neural network module are constructed under the closed-world assumption. This setting allows the model to learn the axioms using only the triples that exist in the knowledge graph, without ontology information.
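The positive/negative sample construction described above might be sketched as follows; the helper names, the data layout, and the random-corruption realization of the closed-world assumption are assumptions made for illustration.

```python
import random

def build_axiom_positives(triples):
    """Collect positive inputs for each axiom module from the existing triples."""
    by_pair = {}                                   # (head, tail) -> relations, for the disjoint axiom
    for s, r, o in triples:
        by_pair.setdefault((s, o), set()).add(r)

    pos = {"domain": set(), "range": set(), "irreflexive": set(),
           "disjoint": set(), "asymmetric": set()}
    for s, r, o in triples:
        pos["domain"].add((r, s))                  # relation and head entity
        pos["range"].add((r, o))                   # relation and tail entity
        pos["irreflexive"].add((s, r, o))          # the current triple itself
        for r1 in by_pair[(s, o)] - {r}:           # another relation linking the same head and tail
            pos["disjoint"].add((s, r, o, s, r1, o))
        for s1, r1, o1 in triples:                 # another triple sharing the same relation
            if r1 == r and (s1, o1) != (s, o):     # ordered pairs; deduplication omitted for brevity
                pos["asymmetric"].add((s, r, o, s1, r, o1))
    return pos

def corrupt(triple, entities):
    """Closed-world negative sample: replace the head or the tail with a random entity."""
    s, r, o = triple
    if random.random() < 0.5:
        return (random.choice(entities), r, o)
    return (s, r, random.choice(entities))
```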
In the above neural-network-based knowledge graph inconsistency reasoning method, when the axioms are modeled by the neural networks using triples:
first, from the axioms defined by the OWL2 ontology language that can be used for inconsistency detection, the axioms to be considered are selected, and the constraints or conditions under which the inconsistency corresponding to each axiom occurs in the knowledge graph are analyzed;
then, a neural network model corresponding to each axiom is built; the positive samples of each neural network model are constructed according to the constraints or conditions of the inconsistency corresponding to the axiom, and negative samples are constructed from the positive samples;
next, for the triple elements that the neural network model of each axiom needs to consider, the corresponding entity and relation representations are concatenated and fed into the neural network model; the prediction value corresponding to each axiom is computed, the prediction score of the neural network models is obtained from the prediction values of all axioms, and this prediction score is combined with the representation score computed by the knowledge representation learning algorithm to obtain the total score of the triple;
finally, a margin loss function is constructed from the total score of each positive-sample triple and the total score of the corresponding negative-sample triple, and the loss is used to jointly update and optimize the parameters of the knowledge representation learning algorithm and the neural network models, as well as the entity and relation representations; after optimization, the neural network models with the learned parameters are the axiom models, and the entity and relation representations of the knowledge graph are determined.
In the above method, when selecting axioms for inconsistency detection, it is first determined, from the conditions or constraints that the axiom definition requires to be satisfied, whether an axiom can be used for inconsistency detection; axioms are then selected from those usable for inconsistency detection, and the conditions or constraints of the selected axioms are mapped onto the relevant elements of the triples in the knowledge graph and the associations between triple elements, which yields the positive samples of each axiom.
In the above method, when judging triple inconsistency, a relation threshold is set for each relation; using the entity and relation representations determined at the end of optimization, the knowledge representation learning algorithm and the axiom models compute the total score of the triple. When the total score is below the relation threshold, the triple is considered a correct triple; otherwise it is a wrong triple, that is, an inconsistency exists.
In the above method, when judging whether a triple is inconsistent on a particular axiom, an axiom threshold is set for each axiom; using the entity and relation vector representations determined when the knowledge representation learning model is generated, each axiom model computes the triple's prediction value for its axiom, and when that prediction value is below the corresponding axiom threshold, the triple is considered inconsistent on that axiom.
In one embodiment, for a triple (s, r, o), the selected axioms include:
Object Property Domain (the domain axiom): the type of the head entity s of relation r should conform to the corresponding class;
Object Property Range (the range axiom): the type of the tail entity o of relation r should conform to the corresponding class;
Disjoint Object Properties (the disjoint axiom): relation r and relation r_1 are mutually exclusive, so the triples (s, r, o) and (s, r_1, o) should not both exist in one knowledge graph;
Irreflexive Object Property (the irreflexive axiom): this requires a two-step judgment; first check whether the relation is irreflexive, then check whether the head and tail entities are equal. If relation r is irreflexive, an entity cannot point to itself through that relation, i.e., the case s = o is not allowed;
Asymmetric Object Property (the asymmetric axiom): this requires a two-step judgment; first check whether the relation is asymmetric, then find the triples (s_1, r, o_1) whose relation is r and check whether s_1 = o and o_1 = s. If the relation is asymmetric, the two triples (s, r, o) and (o, r, s) should not both exist in one knowledge graph.
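For intuition, the judgment conditions of these axioms correspond to the purely symbolic checks sketched below. The ontology-style inputs (domain_of, disjoint_with, and the sets of irreflexive and asymmetric relations) are hypothetical; the point of the method is precisely that such declarations are usually not available, so the constraints are learned by the neural modules instead.

```python
def violates_domain(triple, entity_types, domain_of):
    # Domain axiom: the head entity's type must match the declared domain class of r.
    s, r, o = triple
    return r in domain_of and domain_of[r] not in entity_types.get(s, set())

def violates_irreflexive(triple, irreflexive_relations):
    # Irreflexive axiom: an irreflexive relation must never link an entity to itself.
    s, r, o = triple
    return r in irreflexive_relations and s == o

def violates_asymmetric(triple, triples, asymmetric_relations):
    # Asymmetric axiom: (s, r, o) and (o, r, s) must not both exist for an asymmetric r.
    s, r, o = triple
    return r in asymmetric_relations and (o, r, s) in triples

def violates_disjoint(triple, triples, disjoint_with):
    # Disjoint axiom: (s, r, o) and (s, r1, o) must not both exist for disjoint r and r1.
    s, r, o = triple
    return any((s, r1, o) in triples for r1 in disjoint_with.get(r, ()))
```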
The input of each neural network module is the elements of the current triple or of triples related to the current triple, and the output is the probability of conforming to the corresponding axiom. The computations of the neural network modules are as follows:
domain axiom prediction value: P_dm(s, r, o) = g(W_1·[r; s] + b_1)
range axiom prediction value: P_rg(s, r, o) = g(W_2·[r; o] + b_2)
disjoint axiom prediction value: P_dis(s, r, o, s, r_1, o) = g(W_3·[r; r_1] + b_3)
irreflexive axiom prediction value: P_irre(s, r, o) = g(W_4·[s; r; o] + b_4)
asymmetric axiom prediction value: P_asym(s, r, o, s_1, r, o_1) = g(W_5·[s; r; o; s_1; r; o_1] + b_5)
The total score of the triple is then:
f(s, r, o) = f_r(s, o) + α(1 - P_dm) + β(1 - P_rg) + ε(1 - P_irre) + ζ(1 - P_dis) + η(1 - P_asym)
where s, s_1, r, r_1, o, o_1 denote the learned representation vectors of entity s, entity s_1, relation r, relation r_1, entity o, and entity o_1; the symbol ";" denotes the concatenation operation; g() denotes the sigmoid function; W_1, W_2, W_3, W_4, W_5 are weight vectors; b_1, b_2, b_3, b_4, b_5 are biases; α, β, ε, ζ, η are hyperparameters representing weights; and f_r(s, o) is the representation score of the triple obtained from the knowledge representation learning model.
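A minimal sketch of these axiom modules and of the total score is given below, assuming PyTorch, batched embedding tensors of dimension dim, TransE as the structural score f_r, and placeholder values for the weights α, β, ε, ζ, η; none of these implementation choices are specified by the patent.

```python
import torch
import torch.nn as nn

class AxiomModules(nn.Module):
    """One linear layer plus sigmoid per axiom, as in P = g(W·[...] + b)."""
    def __init__(self, dim: int):
        super().__init__()
        self.dm = nn.Linear(2 * dim, 1)    # domain:      input [r; s]
        self.rg = nn.Linear(2 * dim, 1)    # range:       input [r; o]
        self.dis = nn.Linear(2 * dim, 1)   # disjoint:    input [r; r1]
        self.irr = nn.Linear(3 * dim, 1)   # irreflexive: input [s; r; o]
        self.asy = nn.Linear(6 * dim, 1)   # asymmetric:  input [s; r; o; s1; r; o1]

    def forward(self, s, r, o, s1, r1, o1):
        g = torch.sigmoid
        p_dm = g(self.dm(torch.cat([r, s], dim=-1)))
        p_rg = g(self.rg(torch.cat([r, o], dim=-1)))
        p_dis = g(self.dis(torch.cat([r, r1], dim=-1)))
        p_irr = g(self.irr(torch.cat([s, r, o], dim=-1)))
        p_asy = g(self.asy(torch.cat([s, r, o, s1, r, o1], dim=-1)))
        return p_dm, p_rg, p_irr, p_dis, p_asy

def total_score(s, r, o, probs, weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    # f(s,r,o) = f_r(s,o) + a(1-P_dm) + b(1-P_rg) + e(1-P_irre) + z(1-P_dis) + n(1-P_asym)
    p_dm, p_rg, p_irr, p_dis, p_asy = probs
    a, b, e, z, n = weights
    f_r = torch.norm(s + r - o, p=2, dim=-1, keepdim=True)   # TransE used here as f_r(s, o)
    return (f_r + a * (1 - p_dm) + b * (1 - p_rg) + e * (1 - p_irr)
            + z * (1 - p_dis) + n * (1 - p_asy))
```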
For a correct triple, the axiom probability output by each axiom model is relatively higher than for a wrong triple, and the scoring function above ensures that the total score of a positive-sample triple is lower than that of a negative-sample triple. The constructed margin loss function is:
L = Σ_{(s,r,o)∈F} Σ_{(s',r',o')∈F'} max(0, γ + f(s, r, o) - f(s', r', o'))
where F denotes the set of positive samples, F' denotes the set of negative samples, γ is the margin, f(s, r, o) denotes the total score of the triple (s, r, o), and f(s', r', o') denotes the total score of the triple (s', r', o').
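A sketch of the margin loss and of one joint update step follows; the margin value, the optimizer, and the assumption that positive and negative total scores come in aligned batches are illustrative.

```python
import torch

def margin_loss(pos_scores: torch.Tensor, neg_scores: torch.Tensor, margin: float = 1.0) -> torch.Tensor:
    # Pushes f(positive) + margin below f(negative), so positive triples score lower.
    return torch.clamp(margin + pos_scores - neg_scores, min=0.0).sum()

# One joint training step: the same backward pass updates the entity/relation
# embeddings and the parameters of every axiom module, e.g.
#   loss = margin_loss(f_pos, f_neg)
#   loss.backward()
#   optimizer.step()
```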
Based on the domain, range, disjoint, irreflexive, and asymmetric axioms, when judging the inconsistency of a given triple (s, r, o), the entity and relation vector representations and the neural network parameters learned in the training phase are used to compute the final score f(s, r, o) together with the values of the axiom modules P_dm, P_rg, P_irre, P_dis, and P_asym. For triple-level consistency, a threshold is introduced for each relation; when the score f(s, r, o) is below the threshold, the triple is a correct triple, otherwise it is a wrong triple, that is, an inconsistency exists. To further distinguish whether the triple is inconsistent on any of the considered axioms, a threshold is introduced for each axiom model according to the validation set; if P_dm, P_rg, P_irre, P_dis, or P_asym is below its corresponding axiom threshold, the triple is inconsistent on the domain, range, irreflexive, disjoint, or asymmetric axiom, respectively.
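The threshold-based decisions might be sketched as below; the dictionary interface and the threshold values (which, as described above, are tuned per relation and, on a validation set, per axiom) are assumptions for illustration.

```python
def judge_triple(total_score: float, relation: str, relation_thresholds: dict) -> bool:
    # True means the triple is judged correct: its total score is below the relation threshold.
    return total_score < relation_thresholds[relation]

def judge_axioms(probs, axiom_thresholds: dict) -> dict:
    # A probability below its axiom threshold marks inconsistency on that axiom.
    names = ("domain", "range", "irreflexive", "disjoint", "asymmetric")
    return {name: p < axiom_thresholds[name] for name, p in zip(names, probs)}
```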
Compared with the prior art, the beneficial effects of the present invention include at least the following:
(1) The above knowledge graph inconsistency reasoning method does not require well-defined ontology information and uses only the triples already present in the knowledge graph for axiom learning, so that partial axioms can be captured and inconsistency reasoning performed even on knowledge graphs with no ontology definition or an incomplete one, greatly reducing labor cost. Moreover, the knowledge representation learning model and the axiom models can be applied to any knowledge graph.
(2) Compared with knowledge representation learning algorithms that consider only structural information, this method converts axiom learning into neural network parameter learning, and the joint training of the knowledge representation learning algorithm and the neural networks lets the model both preserve structural information and learn the axioms related to inconsistency. Axiom learning lets the model detect not only inconsistent triples but also, at a fine-grained level, whether inconsistency exists on each of the considered axioms. This enables better inconsistency reasoning and facilitates subsequent correction of the triples.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a flowchart of the neural-network-based knowledge graph inconsistency reasoning method.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention and do not limit its scope of protection.
Figure 1 is a flowchart of the neural-network-based knowledge graph inconsistency reasoning method. As shown in Figure 1, for a given knowledge graph containing a large number of triples (s, r, o), the method includes the following steps:
Step 1: From the OWL2 object property axioms that can be used for inconsistency detection, select the following five axioms, and analyze their descriptions and judgment conditions in OWL2 as follows:
[Table of the five selected axioms (domain, range, disjoint, irreflexive, asymmetric) with their OWL2 descriptions and judgment conditions; see the axiom definitions above.]
Step 2: Obtain the training samples of the neural network corresponding to each axiom according to the judgment conditions, that is, determine the concrete input of each neural network from the judgment condition of its axiom. A sample knowledge graph is given below to illustrate the input of each axiom model:
(s_1, r_1, o_1)
(s_2, r_2, o_2)
(s_1, r_3, o_1)
(s_2, r_1, o_2)
(s_1, r_1, s_1)
(o_1, r_3, s_1)
The given sample graph contains 6 triples, with entities (s_1, s_2, o_1, o_2) and relations (r_1, r_2, r_3). During training, the domain axiom concerns the relation and the head entity, and the module inputs are (r_1, s_1), (r_2, s_2), (r_3, s_1), (r_1, s_2), (r_3, o_1); the range axiom concerns the relation and the tail entity, and the module inputs are (r_1, o_1), (r_2, o_2), (r_3, o_1), (r_1, o_2), (r_1, s_1), (r_3, s_1); the disjoint axiom concerns the correlation of the relations in pairs of triples whose head and tail entities are both the same, and the module inputs are (s_1, r_1, o_1, s_1, r_3, o_1), (s_2, r_2, o_2, s_2, r_1, o_2); the irreflexive axiom concerns the current triple, and the module inputs are (s_1, r_1, o_1), (s_2, r_2, o_2), (s_1, r_3, o_1), (s_2, r_1, o_2), (s_1, r_1, s_1), (o_1, r_3, s_1); the asymmetric axiom concerns two triples with the same relation, and the inputs are (s_1, r_1, o_1, s_2, r_1, o_2), (s_1, r_1, o_1, s_1, r_1, s_1), (s_2, r_1, o_2, s_1, r_1, s_1). The knowledge representation learning algorithm concerns the current triple, and its inputs are (s_1, r_1, o_1), (s_2, r_2, o_2), (s_1, r_3, o_1), (s_2, r_1, o_2), (s_1, r_1, s_1), (o_1, r_3, s_1). The input of each module above can be regarded as a positive sample of that module.
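As a check on the example above, the domain-, range-, and irreflexive-module inputs can be derived directly from the six triples; the entity and relation names are written as plain strings for illustration.

```python
toy_graph = [
    ("s1", "r1", "o1"), ("s2", "r2", "o2"), ("s1", "r3", "o1"),
    ("s2", "r1", "o2"), ("s1", "r1", "s1"), ("o1", "r3", "s1"),
]

domain_inputs = {(r, s) for s, r, o in toy_graph}       # relation and head entity
range_inputs = {(r, o) for s, r, o in toy_graph}        # relation and tail entity
irreflexive_inputs = set(toy_graph)                     # the triples themselves

print(sorted(domain_inputs))
# [('r1', 's1'), ('r1', 's2'), ('r2', 's2'), ('r3', 'o1'), ('r3', 's1')]
# i.e. the five domain-module inputs listed in the text
```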
Step 3: Jointly train the knowledge representation learning algorithm and the neural networks corresponding to the axioms.
After the neural network corresponding to each axiom has been built, the entity and relation representations of each sample are concatenated and fed into the neural network model, the prediction value corresponding to each axiom is computed, and the prediction score of the neural network models is obtained from the prediction values of all axioms. The representation score of the triple and the prediction score of the axiom models are combined to obtain the total score of the triple. Finally, the margin loss function is constructed from the total scores of positive-sample triples and the corresponding negative-sample triples and is used to jointly update and optimize the parameters of the knowledge representation learning algorithm and the neural network models. After optimization, the knowledge representation learning algorithm with the determined entity and relation vector representations is the knowledge representation learning model, and the neural network models with the determined parameters are the axiom models.
Step 4: With the knowledge representation learning model and the axiom models trained in Step 3, inconsistency reasoning over the knowledge graph can be performed. Given a triple, its final score is computed; if the score is below the threshold, the triple is judged correct, and if it is above the threshold, the triple is considered inconsistent. For each axiom module, the probability of conforming to the axiom is computed; if the probability is above the corresponding threshold, the triple is consistent with that axiom, and otherwise it is inconsistent on that axiom.
The specific embodiments described above explain the technical solutions and beneficial effects of the present invention in detail. It should be understood that the above is only the most preferred embodiment of the present invention and is not intended to limit it; any modifications, additions, and equivalent substitutions made within the scope of the principles of the present invention shall fall within its scope of protection.

Claims (9)

  1. A neural-network-based knowledge graph inconsistency reasoning method, characterized by comprising the following steps:
    taking the entity representations and relation representations of triples as input, using a knowledge representation learning algorithm to learn the entity and relation representations of the triples in the knowledge graph and computing a representation score for each triple; using the entity and relation representations as the input of a neural network, using the triples to model each axiom through the neural network so as to learn the network parameters that represent the corresponding axiom and obtain an axiom model; using the axiom model to obtain an axiom prediction value for the triple; and judging, based on the representation score and the axiom prediction value of the triple, whether the triple and the corresponding axiom are inconsistent.
  2. The neural-network-based knowledge graph inconsistency reasoning method according to claim 1, characterized in that, when the axioms are modeled by the neural network using triples:
    first, from the axioms defined by the OWL2 ontology language that can be used for inconsistency detection, the axioms to be considered are selected, and the constraints or conditions under which the inconsistency corresponding to each axiom occurs in the knowledge graph are analyzed;
    then, a neural network model corresponding to each axiom is built; the positive samples of each neural network model are constructed according to the constraints or conditions of the inconsistency corresponding to the axiom, and negative samples are constructed from the positive samples;
    next, for the triple elements that the neural network model of each axiom needs to consider, the corresponding entity and relation representations are concatenated and fed into the neural network model; the prediction value corresponding to each axiom is computed, the prediction score of the neural network models is obtained from the prediction values of all axioms, and this is combined with the representation score computed by the knowledge representation learning algorithm to obtain the total score of the triple;
    finally, a margin loss function is constructed from the total score of each positive-sample triple and the total score of the corresponding negative-sample triple and is used to jointly update and optimize the parameters of the knowledge representation learning algorithm and the neural network models as well as the entity and relation representations; after optimization, the neural network models with the learned parameters are the axiom models, and the entity and relation representations of the knowledge graph are determined.
  3. The neural-network-based knowledge graph inconsistency reasoning method according to claim 2, characterized in that, when selecting axioms for inconsistency detection, it is first determined, from the conditions or constraints that the axiom definition requires to be satisfied, whether an axiom can be used for inconsistency detection; axioms are then selected from those usable for inconsistency detection, and the conditions or constraints corresponding to the selected axioms are mapped onto the relevant elements of the triples in the knowledge graph and the associations between triple elements, thereby constructing the positive samples of the axioms.
  4. The neural-network-based knowledge graph inconsistency reasoning method according to claim 2, characterized in that, when judging triple inconsistency, a relation threshold is set for each relation; using the entity and relation representations determined at the end of optimization, the knowledge representation learning algorithm and the axiom models compute the total score of the triple, and when the total score of the triple is below the relation threshold, the triple is considered a correct triple, otherwise it is a wrong triple, that is, an inconsistency exists.
  5. The neural-network-based knowledge graph inconsistency reasoning method according to claim 2, characterized in that, when judging whether a triple is inconsistent on a corresponding axiom, an axiom threshold is set for each axiom, each axiom model is used to compute the triple's prediction value for its axiom, and when the prediction value of the triple for the axiom is below the corresponding axiom threshold, the triple is considered inconsistent on that axiom.
  6. 如权利要求2所述的基于神经网络的知识图谱不一致性推理方法,其特征在于,针对三元组(s,r,o),选中的公理包括:The method for reasoning about inconsistency of knowledge graph based on neural network according to claim 2, wherein, for triples (s, r, o), the selected axioms include:
    the object property domain axiom (domain axiom), which requires that the type of the head entity s of relation r conform to the corresponding class;
    the object property range axiom (range axiom), which requires that the type of the tail entity o of relation r conform to the corresponding class;
    the disjoint object properties axiom (disjoint axiom), which states that relation r and relation r_1 are mutually exclusive, so the triplets (s, r, o) and (s, r_1, o) should not both exist in the same knowledge graph;
    the irreflexive object property axiom (irreflexive axiom), which requires a two-step judgment: first check whether the relation is irreflexive, then check whether the head entity and the tail entity are equal; if relation r is irreflexive, an entity cannot point to itself through that relation, i.e., s = o must not hold;
    the asymmetric object property axiom (asymmetric axiom), which requires a two-step judgment: first check whether the relation is asymmetric, then look up the triplets (s_1, r, o_1) with relation r and check whether s_1 = o and o_1 = s; if the relation is asymmetric, the two triplets (s, r, o) and (o, r, s) should not both exist in the same knowledge graph.
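    The symbolic checks implied by the last three axioms can be sketched as follows; the knowledge graph is assumed to be a plain set of triples, and the sets of irreflexive, asymmetric, and disjoint relations are assumed to come from the ontology, which the claims do not spell out.

```python
from typing import Dict, Set, Tuple

Triplet = Tuple[str, str, str]  # (head entity, relation, tail entity)

def irreflexive_violation(triplet: Triplet, irreflexive_relations: Set[str]) -> bool:
    # An irreflexive relation may not link an entity to itself.
    s, r, o = triplet
    return r in irreflexive_relations and s == o

def asymmetric_violation(triplet: Triplet, kg: Set[Triplet],
                         asymmetric_relations: Set[str]) -> bool:
    # An asymmetric relation may not hold in both directions between two entities.
    s, r, o = triplet
    return r in asymmetric_relations and (o, r, s) in kg

def disjoint_violation(triplet: Triplet, kg: Set[Triplet],
                       disjoint_pairs: Dict[str, Set[str]]) -> bool:
    # Two mutually exclusive relations may not both connect the same head and tail.
    s, r, o = triplet
    return any((s, r1, o) in kg for r1 in disjoint_pairs.get(r, set()))
```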
  7. The neural network-based knowledge graph inconsistency reasoning method according to claim 6, wherein the input of each neural network module is the elements of the current triplet, or of triplets related to the current triplet, and the output is the probability of conforming to the corresponding axiom; the computations of the neural network modules are as follows:
    predicted value of the domain axiom: P_dm(s, r, o) = g(W_1 · [r; s] + b_1)
    predicted value of the range axiom: P_rg(s, r, o) = g(W_2 · [r; o] + b_2)
    predicted value of the disjoint axiom: P_dis(s, r, o, s, r_1, o) = g(W_3 · [r; r_1] + b_3)
    predicted value of the irreflexive axiom: P_irre(s, r, o) = g(W_4 · [s; r; o] + b_4)
    predicted value of the asymmetric axiom: P_asym(s, r, o, s_1, r, o_1) = g(W_5 · [s; r; o; s_1; r; o_1] + b_5);
    the total score of the triplet is then:
    f(s, r, o) = f_r(s, o) + α(1 − P_dm) + β(1 − P_rg) + ε(1 − P_irre) + ζ(1 − P_dis) + η(1 − P_asym)
    where s, s_1, r, r_1, o, o_1 denote the learned representation vectors of entity s, entity s_1, relation r, relation r_1, entity o and entity o_1, respectively; the symbol ; denotes the concatenation operation; g(·) denotes the sigmoid function; W_1, W_2, W_3, W_4, W_5 are weight vectors; b_1, b_2, b_3, b_4, b_5 are biases; α, β, ε, ζ, η are hyperparameters representing weights; and f_r(s, o) denotes the representation score of the triplet obtained with the knowledge representation learning model.
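    A minimal NumPy sketch of these scoring formulas, assuming d-dimensional embeddings, weight vectors W_i that map the concatenated input to a scalar, and TransE as the representation score f_r; the shapes, parameter names, and the choice of TransE here are assumptions made only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def axiom_prob(W, b, *vectors):
    # g(W · [v1; v2; ...] + b): probability of conforming to one axiom.
    return sigmoid(float(W @ np.concatenate(vectors) + b))

def total_score(s, r, o, s1, r1, o1, params, weights):
    """f(s,r,o) = f_r(s,o) plus weighted (1 - P_axiom) penalty terms.

    params: dict mapping each axiom name to its (W_i, b_i) pair;
    weights: dict holding the hyperparameters alpha, beta, eps, zeta, eta.
    TransE distance ||s + r - o|| is used as the representation score f_r.
    """
    f_r = np.linalg.norm(s + r - o)
    p_dm   = axiom_prob(*params["domain"],      r, s)
    p_rg   = axiom_prob(*params["range"],       r, o)
    p_dis  = axiom_prob(*params["disjoint"],    r, r1)
    p_irre = axiom_prob(*params["irreflexive"], s, r, o)
    p_asym = axiom_prob(*params["asymmetric"],  s, r, o, s1, r, o1)
    return (f_r
            + weights["alpha"] * (1 - p_dm)
            + weights["beta"]  * (1 - p_rg)
            + weights["eps"]   * (1 - p_irre)
            + weights["zeta"]  * (1 - p_dis)
            + weights["eta"]   * (1 - p_asym))
```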
  8. The neural network-based knowledge graph inconsistency reasoning method according to claim 1, wherein the constructed margin loss function is:
    [margin loss formula, provided as an image in the original publication: Figure PCTCN2021116777-appb-100001]
    where F denotes the set of positive samples, F′ denotes the set of negative samples, f(s, r, o) denotes the total score of the triplet (s, r, o), and f(s′, r′, o′) denotes the total score of the triplet (s′, r′, o′).
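    Since the formula itself is only available as an image, a plausible reconstruction consistent with the symbols defined above is the standard margin ranking loss; the margin γ is an assumption, and the original formula may differ in detail:

```latex
\mathcal{L} \;=\; \sum_{(s,r,o)\in F} \; \sum_{(s',r',o')\in F'}
\max\bigl(0,\; \gamma + f(s,r,o) - f(s',r',o')\bigr)
```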
  9. The neural network-based knowledge graph inconsistency reasoning method according to claim 1, wherein the knowledge representation learning algorithm comprises the TransE algorithm or the DistMult algorithm.
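    For reference, minimal sketches of the two named scoring functions; negating the DistMult plausibility so that, like TransE's distance, a lower value means a more plausible triplet is an assumption made for consistency with claim 4, not something stated in the claims.

```python
import numpy as np

def transe_score(s, r, o, norm=1):
    # TransE: distance ||s + r - o||; smaller means more plausible.
    return np.linalg.norm(s + r - o, ord=norm)

def distmult_score(s, r, o):
    # DistMult: plausibility <s, r, o> = sum(s * r * o); negated here so that
    # smaller values again mean more plausible, matching the thresholds above.
    return -float(np.sum(s * r * o))
```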
PCT/CN2021/116777 2020-09-16 2021-09-06 Neural network–based knowledge graph inconsistency reasoning method WO2022057671A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010973433.0A CN112100403A (en) 2020-09-16 2020-09-16 Knowledge graph inconsistency reasoning method based on neural network
CN202010973433.0 2020-09-16

Publications (1)

Publication Number Publication Date
WO2022057671A1 true WO2022057671A1 (en) 2022-03-24

Family

ID=73759649

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/116777 WO2022057671A1 (en) 2020-09-16 2021-09-06 Neural network–based knowledge graph inconsistency reasoning method

Country Status (2)

Country Link
CN (1) CN112100403A (en)
WO (1) WO2022057671A1 (en)


Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112100403A (en) * 2020-09-16 2020-12-18 浙江大学 Knowledge graph inconsistency reasoning method based on neural network
CN112633927B (en) * 2020-12-23 2021-11-19 浙江大学 Combined commodity mining method based on knowledge graph rule embedding
CN113449118B (en) * 2021-06-29 2022-09-20 华南理工大学 Standard document conflict detection method and system based on standard knowledge graph
CN114357192B (en) * 2021-12-31 2024-04-19 海南大学 DIKW-based content integrity modeling and judging method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200065668A1 (en) * 2018-08-27 2020-02-27 NEC Laboratories Europe GmbH Method and system for learning sequence encoders for temporal knowledge graph completion
CN111368096A (en) * 2020-03-09 2020-07-03 中国平安人寿保险股份有限公司 Knowledge graph-based information analysis method, device, equipment and storage medium
CN111444305A (en) * 2020-03-19 2020-07-24 浙江大学 Multi-triple combined extraction method based on knowledge graph embedding
CN111582509A (en) * 2020-05-07 2020-08-25 南京邮电大学 Knowledge graph representation learning and neural network based collaborative recommendation method
CN112100403A (en) * 2020-09-16 2020-12-18 浙江大学 Knowledge graph inconsistency reasoning method based on neural network


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114741460A (en) * 2022-06-10 2022-07-12 山东大学 Knowledge graph data expansion method and system based on association between rules
CN116340524A (en) * 2022-11-11 2023-06-27 华东师范大学 Method for supplementing small sample temporal knowledge graph based on relational adaptive network
CN116340524B (en) * 2022-11-11 2024-03-08 华东师范大学 Method for supplementing small sample temporal knowledge graph based on relational adaptive network
CN117591657A (en) * 2023-12-22 2024-02-23 宿迁乐享知途网络科技有限公司 Intelligent dialogue management system and method based on AI
CN117591657B (en) * 2023-12-22 2024-05-07 宿迁乐享知途网络科技有限公司 Intelligent dialogue management system and method based on AI

Also Published As

Publication number Publication date
CN112100403A (en) 2020-12-18

Similar Documents

Publication Publication Date Title
WO2022057671A1 (en) Neural network–based knowledge graph inconsistency reasoning method
CN112232416B (en) Semi-supervised learning method based on pseudo label weighting
CN111753101B (en) Knowledge graph representation learning method integrating entity description and type
CN108399428B (en) Triple loss function design method based on trace ratio criterion
CN112217674B (en) Alarm root cause identification method based on causal network mining and graph attention network
CN111914550B (en) Knowledge graph updating method and system oriented to limited field
TWI717826B (en) Method and device for extracting main words through reinforcement learning
CN114022904B (en) Noise robust pedestrian re-identification method based on two stages
CN114998691B (en) Semi-supervised ship classification model training method and device
CN115511118A (en) Artificial intelligence-based heat supply system fault auxiliary decision-making method and system
CN113032238A (en) Real-time root cause analysis method based on application knowledge graph
CN108596204B (en) Improved SCDAE-based semi-supervised modulation mode classification model method
CN116668083A (en) Network traffic anomaly detection method and system
Lawrence et al. Explaining neural matrix factorization with gradient rollback
CN112348108A (en) Sample labeling method based on crowdsourcing mode
CN117196033A (en) Wireless communication network knowledge graph representation learning method based on heterogeneous graph neural network
CN112579777A (en) Semi-supervised classification method for unlabelled texts
Cano et al. A score based ranking of the edges for the PC algorithm
CN115174263B (en) Attack path dynamic decision method and device
WO2023273171A1 (en) Image processing method and apparatus, device, and storage medium
CN113283243B (en) Entity and relationship combined extraction method
CN115767546A (en) 5G network security situation assessment method for quantifying node risks
Vagin et al. Inductive inference and argumentation methods in modern intelligent decision support systems
CN110570093B (en) Method and device for automatically managing business expansion channel
CN104156603B (en) protein identification method based on protein interaction network and proteomics

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21868496

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21868496

Country of ref document: EP

Kind code of ref document: A1