CN115809340A - Entity updating method and system of knowledge graph - Google Patents
- Publication number: CN115809340A
- Application number: CN202211047396.6A
- Authority: CN (China)
- Prior art keywords: entity, sentence, similarity, knowledge graph, representing
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses an entity updating method and system for a knowledge graph. The method acquires a new knowledge graph and an original knowledge graph; calculates a name attribute similarity matrix based on the name attributes of the two graphs; calculates an entity relationship similarity matrix based on their entity relationship structure triples; and fuses the two matrices to obtain, for each entity of the new knowledge graph, the corresponding entity, which is then updated into the original knowledge graph. Advantages: building on multi-attention entity alignment research, an entity alignment method for fault diagnosis knowledge graphs is provided that combines long-text name attributes with relationship structure similarity calculation. A knowledge graph updating tool developed on this method has, in case testing and actual use, effectively improved both the accuracy of entity alignment and the efficiency of knowledge graph updating.
Description
Technical Field
The invention relates to an entity updating method and system of a knowledge graph, and belongs to the technical field of cloud data center diagnosis.
Background
With the development of knowledge graph technology in the field of intelligent operation and maintenance, knowledge-graph-based intelligent diagnosis, reasoning, knowledge recommendation, and related techniques have drawn researchers' attention. Because the knowledge in a fault diagnosis knowledge graph must be updated to track the actual topology and condition of the cloud data center, an automatic knowledge graph updating tool is needed to fuse new knowledge into the original knowledge graph.
A cloud data center fault diagnosis knowledge graph is a domain knowledge graph. It is characterized by a clear relationship structure and a large amount of information in the entity name attributes. During knowledge updating, entities cannot be aligned accurately from name attribute similarity alone. Current entity alignment work focuses on entity structure and attribute information, but the attribute information of an entity is limited to the entity's own node and cannot be learned interactively with the entity's neighborhood structure.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the defects of the prior art and provide a method and a system for updating an entity of a knowledge graph.
In order to solve the above technical problem, the present invention provides a method for updating an entity of a knowledge graph, including:
acquiring a new knowledge graph and an original knowledge graph;
based on the name attributes of the new knowledge graph and the original knowledge graph, calculating a name attribute similarity matrix;
calculating an entity relationship similarity matrix based on the entity relationship structure triples of the new knowledge graph and the original knowledge graph;
and fusing the name attribute similarity matrix and the entity relationship similarity matrix to obtain an entity corresponding to the new knowledge graph, and updating the entity corresponding to the new knowledge graph into the original knowledge graph.
Further, calculating the name attribute similarity matrix based on the name attributes of the new knowledge graph and the original knowledge graph includes:
extracting, by looking up a character embedding matrix, the word vector s_Ai and the character vector x_Ai of each sentence S_Ai in the new knowledge graph;
extracting, by looking up a character embedding matrix, the word vector s_Bi and the character vector x_Bi of each sentence S_Bi in the original knowledge graph;
calculating the word-vector similarity of sentence S_Ai to sentence S_Bi from the word vectors s_Ai and s_Bi;
calculating the character-vector similarity of sentence S_Ai to sentence S_Bi from the character vectors x_Ai and x_Bi;
summing and averaging the word-vector similarity and the character-vector similarity of sentence S_Ai to sentence S_Bi to obtain the name attribute similarity S_namei of sentence S_Ai to sentence S_Bi;
and collecting the name attribute similarity of every sentence in the new knowledge graph to every sentence in the original knowledge graph to obtain the name attribute similarity matrix.
Further, calculating the entity relationship similarity matrix based on the entity relationship structure triples of the new knowledge graph and the original knowledge graph includes:
acquiring the entity relationship structure triple of each sentence in the new knowledge graph, inputting the entity relationship structure triple into an entity relationship structure similarity model trained in advance on the relationship structure triples of the original knowledge graph to obtain the entity structure similarity of each sentence, and assembling the entity structure similarity of each sentence into the entity structure similarity matrix.
Further, the training process of the entity relationship structure similarity model, trained on the entity relationship structure triples of the original knowledge graph, includes:
constructing an entity relationship structure similarity model to be trained, expressed as:

e_i^(l+1) = σ( Σ_{j ∈ N_i ∪ {i}} α_ij^(l) W^(l) e_j^(l) )

α_ij^(l) = exp( LeakyReLU( c_ij^(l) ) ) / Σ_{k ∈ N_i ∪ {i}} exp( LeakyReLU( c_ik^(l) ) ), with c_ij^(l) = u^T [ W^(l) e_i^(l) ‖ W^(l) e_j^(l) ]

wherein e_i^(l) and e_i^(l+1) denote the entity vectors input to and output from the l-th neighborhood attention layer, the input of which comprises the entity e_i and all its neighbors; σ denotes the sigmoid activation function; N_i denotes the set of entities connected to entity e_i, e_j denotes a neighbor of entity e_i, and e_k ranges over all neighbors of entity e_i; α_ij^(l) denotes the normalized neighborhood attention coefficient at layer l; c_ij^(l) denotes the result of fusing the information of entity e_i with neighbor j, and c_ik^(l) the result of fusing the information of entity e_i with neighbor k; exp() denotes the exponential function with the natural constant e as base; LeakyReLU() denotes an activation function; u ∈ R^(2d(l+1)×1) and W^(l) ∈ R^(d(l+1)×d(l)) are learnable parameter matrices; d(l) and d(l+1) denote the network embedding dimensions of the l-th and (l+1)-th layers; the superscript T denotes matrix transposition;
constructing a pre-aligned entity seed set and positive/negative example triples;
constructing a loss function L_A for entity alignment, used to train the entity relationship structure similarity model, expressed as:

L_A = L_0 + L_a

L_a = Σ_{(e, e') ∈ S} [ d(e, e') + γ − d(e_, e'_) ]_+, with e_ ∈ NS(e) and e'_ ∈ NS(e')

wherein L_a denotes the entity alignment loss function of the entity relationship structure similarity model and L_0 denotes the orthogonalization loss function of the parameter matrix W; the negative sample set of entity e and the negative sample set of its aligned entity e' are constructed by the nearest neighbor sampling method NS(·); S denotes the pre-aligned entity seed set; d(·,·) = 1 − cos(·,·) denotes the cosine distance between entities; [·]_+ = max{·, 0}; γ is a hyperparameter;

L_0 = Σ_{l=1}^{M} ‖ (W^(l))^T W^(l) − I ‖_2^2

wherein W^(l) denotes the parameter matrix of the l-th layer; M is the number of attention network embedding layers; ‖·‖_2^2 denotes taking the 2-norm of a matrix and squaring it;
constructing a loss function L_R of the relationship structure, used to train the entity relationship structure similarity model, expressed as:

L_R = Σ_{(h,r,t) ∈ T_1} Σ_{(h',r,t') ∈ T_1'} [ γ' + f(h, r, t) − f(h', r, t') ]_+

wherein f(h, r, t) = ‖h + r − t‖_2 denotes the scoring function of a relation triple (h, r, t), used to calculate the confidence of the relation triple; h and t are the head and tail entity vectors from the global structure embedding layer, and r is the relation vector to be modeled and learned; γ' is a hyperparameter; T_1 denotes the positive example triple set, and T_1' = {(h', r, t) | h' ∈ E} ∪ {(h, r, t') | t' ∈ E} denotes the negative example triple set, where h' denotes a head entity of the negative example global structure embedding layer, t' denotes a tail entity of the negative example global structure embedding layer, and E denotes the set containing all negative example entities;
training the entity alignment loss function L_A and the relationship structure loss function L_R with the pre-aligned entity seed set and the positive/negative example triples respectively, determining the final model parameters of the entity relationship structure similarity model, and updating the model with these parameters to obtain the trained entity relationship structure similarity model.
Further, fusing the name attribute similarity matrix and the entity relationship similarity matrix includes:
normalizing the name attribute similarity matrix and the entity relationship similarity matrix respectively;
and averaging the normalized name attribute similarity matrix and the normalized entity relationship similarity matrix to obtain the fused final entity similarity matrix.
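As an illustration of the claimed flow, the four method steps can be sketched as follows. This is a minimal sketch: the similarity functions are passed in as stand-ins for the two channels, and the min-max normalization and the acceptance threshold are illustrative assumptions, not part of the claims.

```python
import numpy as np

def update_knowledge_graph(new_kg, orig_kg, name_sim_fn, struct_sim_fn, threshold=0.5):
    """Sketch of the claimed update flow: compute the two similarity
    matrices, fuse them, and pick the best original-graph match for each
    new entity above a (hypothetical) acceptance threshold."""
    S_name = name_sim_fn(new_kg, orig_kg)   # name attribute similarity matrix
    S_rel = struct_sim_fn(new_kg, orig_kg)  # entity relationship similarity matrix

    def min_max(S):
        # normalize each matrix to [0, 1] before fusion
        return (S - S.min()) / (S.max() - S.min() + 1e-12)

    S = (min_max(S_name) + min_max(S_rel)) / 2  # fused final similarity
    matches = {}
    for i, entity in enumerate(new_kg["entities"]):
        j = int(np.argmax(S[i]))
        if S[i, j] >= threshold:  # aligned: update this entity in the original graph
            matches[entity] = orig_kg["entities"][j]
    return matches, S
```

In practice the matched pairs would be passed through the manual inspection step described below before the original graph is modified.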
A system for entity updating of a knowledge graph, comprising:
the acquisition module is used for acquiring a new knowledge graph and an original knowledge graph;
the first calculation module is used for calculating a name attribute similarity matrix based on the name attributes of the new knowledge graph and the original knowledge graph;
the second calculation module is used for calculating an entity relationship similarity matrix based on the entity relationship structure triples of the new knowledge graph and the original knowledge graph;
and the updating module is used for fusing the name attribute similarity matrix and the entity relationship similarity matrix to obtain an entity corresponding to the new knowledge graph and updating the entity corresponding to the new knowledge graph into the original knowledge graph.
Further, the first calculation module is used for:
extracting, by looking up a character embedding matrix, the word vector s_Ai and the character vector x_Ai of each sentence S_Ai in the new knowledge graph;
extracting, by looking up a character embedding matrix, the word vector s_Bi and the character vector x_Bi of each sentence S_Bi in the original knowledge graph;
calculating the word-vector similarity of sentence S_Ai to sentence S_Bi from the word vectors s_Ai and s_Bi;
calculating the character-vector similarity of sentence S_Ai to sentence S_Bi from the character vectors x_Ai and x_Bi;
summing and averaging the word-vector similarity and the character-vector similarity of sentence S_Ai to sentence S_Bi to obtain the name attribute similarity S_namei of sentence S_Ai to sentence S_Bi;
and collecting the name attribute similarity of every sentence in the new knowledge graph to every sentence in the original knowledge graph to obtain the name attribute similarity matrix.
Further, the second calculation module is used for:
acquiring the entity relationship structure triple of each sentence in the new knowledge graph, inputting the entity relationship structure triple into an entity relationship structure similarity model trained in advance on the relationship structure triples of the original knowledge graph to obtain the entity structure similarity of each sentence, and assembling the entity structure similarity of each sentence into the entity structure similarity matrix.
Further, the second calculation module is further used for:
constructing an entity relationship structure similarity model to be trained, expressed as:

e_i^(l+1) = σ( Σ_{j ∈ N_i ∪ {i}} α_ij^(l) W^(l) e_j^(l) )

α_ij^(l) = exp( LeakyReLU( c_ij^(l) ) ) / Σ_{k ∈ N_i ∪ {i}} exp( LeakyReLU( c_ik^(l) ) ), with c_ij^(l) = u^T [ W^(l) e_i^(l) ‖ W^(l) e_j^(l) ]

wherein e_i^(l) and e_i^(l+1) denote the entity vectors input to and output from the l-th neighborhood attention layer, the input of which comprises the entity e_i and all its neighbors; σ denotes the sigmoid activation function; N_i denotes the set of entities connected to entity e_i, e_j denotes a neighbor of entity e_i, and e_k ranges over all neighbors of entity e_i; α_ij^(l) denotes the normalized neighborhood attention coefficient at layer l; c_ij^(l) denotes the result of fusing the information of entity e_i with neighbor j, and c_ik^(l) the result of fusing the information of entity e_i with neighbor k; exp() denotes the exponential function with the natural constant e as base; LeakyReLU() denotes an activation function; u ∈ R^(2d(l+1)×1) and W^(l) ∈ R^(d(l+1)×d(l)) are learnable parameter matrices; d(l) and d(l+1) denote the network embedding dimensions of the l-th and (l+1)-th layers; the superscript T denotes matrix transposition;
constructing a pre-aligned entity seed set and positive/negative example triples;
constructing a loss function L_A for entity alignment, used to train the entity relationship structure similarity model, expressed as:

L_A = L_0 + L_a

L_a = Σ_{(e, e') ∈ S} [ d(e, e') + γ − d(e_, e'_) ]_+, with e_ ∈ NS(e) and e'_ ∈ NS(e')

wherein L_a denotes the entity alignment loss function of the entity relationship structure similarity model and L_0 denotes the orthogonalization loss function of the parameter matrix W; the negative sample set of entity e and the negative sample set of its aligned entity e' are constructed by the nearest neighbor sampling method NS(·); S denotes the pre-aligned entity seed set; d(·,·) = 1 − cos(·,·) denotes the cosine distance between entities; [·]_+ = max{·, 0}; γ is a hyperparameter;

L_0 = Σ_{l=1}^{M} ‖ (W^(l))^T W^(l) − I ‖_2^2

wherein W^(l) denotes the parameter matrix of the l-th layer; M is the number of attention network embedding layers; ‖·‖_2^2 denotes taking the 2-norm of a matrix and squaring it;
constructing a loss function L_R of the relationship structure, used to train the entity relationship structure similarity model, expressed as:

L_R = Σ_{(h,r,t) ∈ T_1} Σ_{(h',r,t') ∈ T_1'} [ γ' + f(h, r, t) − f(h', r, t') ]_+

wherein f(h, r, t) = ‖h + r − t‖_2 denotes the scoring function of a relation triple (h, r, t), used to calculate the confidence of the relation triple; h and t are the head and tail entity vectors from the global structure embedding layer, and r is the relation vector to be modeled and learned; γ' is a hyperparameter; T_1 denotes the positive example triple set, and T_1' = {(h', r, t) | h' ∈ E} ∪ {(h, r, t') | t' ∈ E} denotes the negative example triple set, where h' denotes a head entity of the negative example global structure embedding layer, t' denotes a tail entity of the negative example global structure embedding layer, and E denotes the set containing all negative example entities;
training the entity alignment loss function L_A and the relationship structure loss function L_R with the pre-aligned entity seed set and the positive/negative example triples respectively, determining the final model parameters of the entity relationship structure similarity model, and updating the model with these parameters to obtain the trained entity relationship structure similarity model.
Further, the update module is used for:
normalizing the name attribute similarity matrix and the entity relationship similarity matrix respectively;
and averaging the normalized name attribute similarity matrix and the normalized entity relationship similarity matrix to obtain the fused final entity similarity matrix.
A computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods.
A computing device, comprising:
one or more processors, a memory, and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing any of the methods.
The invention achieves the following beneficial effects:
On the basis of multi-attention entity alignment research, an entity alignment method for fault diagnosis knowledge graphs is provided that combines long-text name attributes with relationship structure similarity calculation. A knowledge graph updating tool is developed based on the method; through case testing and actual use, it effectively improves the accuracy of entity alignment and the efficiency of knowledge graph updating.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a block diagram of the overall framework of the method for entity update of a knowledge-graph of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
As shown in fig. 1, the present invention discloses an entity updating method of a knowledge graph, which comprises:
acquiring a new knowledge graph and an original knowledge graph;
based on the name attributes of the new knowledge graph and the original knowledge graph, calculating a name attribute similarity matrix;
calculating an entity relationship similarity matrix based on the entity relationship structure triples of the new knowledge graph and the original knowledge graph;
and fusing the name attribute similarity matrix and the entity relationship similarity matrix to obtain an entity corresponding to the new knowledge graph, and updating the entity corresponding to the new knowledge graph into the original knowledge graph.
Fig. 2 is a specific framework diagram of the method for updating an entity of a knowledge graph. The structural channel converts the structural features contained in the relationship triples of an entity into a graph entity feature vector, and an entity similarity matrix is obtained through similarity calculation. The attribute channel calculates an entity attribute similarity matrix from the name attributes using cosine similarity. Finally, the entity corresponding to the new knowledge is obtained by fusing the structural similarity and the attribute similarity; after manual inspection, the new knowledge is updated into the original graph, and the attributes and relationship content in the original graph are modified or extended.
The method specifically comprises the following steps:
First, vector generation. The model provided by the invention fuses word segmentation feature vectors, word vectors, and character vectors. The input of the model is a sentence together with all self-matching words in the sentence, where a self-matching word of a character is a word containing that character. Let s = {c_1, c_2, ..., c_n} denote the sentence, where c_i is the i-th character. Each character in the sentence is represented as a vector x_i by looking up a character embedding matrix:

x_i = e_c(c_i)

wherein e_c is the character embedding lookup table and c_i is a character in the sentence.
The model segments the sentence with a word segmentation tool and labels the training data to construct the segmentation features. The resulting character representation containing word boundary information is:

c_i = [x_i ⊕ s_i]

wherein x_i denotes the character vector of the character, s_i denotes the segmentation feature vector of the character, ⊕ denotes vector concatenation, and c_i denotes the fused character vector representation.
Second, self-matching word vector generation. To represent the semantic information of words, vector representations of the self-matching words are obtained. The vocabulary that the input sentence can match in the model is denoted l = {l_1, l_2, ..., l_m}, and each word is represented as a semantic vector z_i by looking up a pre-trained word embedding matrix:

z_i = e_w(l_i)

Finally, the character vectors and word vectors are concatenated to obtain the final output representation of the embedding layer:

Node_f = [v_1, v_2, ..., v_{n+m}] = [c_1, c_2, ..., c_n, z_1, z_2, ..., z_m]

wherein v_i denotes a final vector representation, c_i is a fused character vector representation, and z_i is a self-matching word vector representation.
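The embedding layer above can be sketched in NumPy as follows. The lookup tables, vector widths, and the BMES-style boundary feature are illustrative assumptions; the patent obtains `e_c`, `e_w`, and `s_i` from pretrained embedding matrices and a word segmentation tool.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical lookup tables e_c (characters) and e_w (words); in the
# patent these come from pretrained embedding matrices.
e_c = {ch: rng.normal(size=4) for ch in "故障网络中断"}
e_w = {w: rng.normal(size=4) for w in ["故障", "网络", "中断"]}

def seg_feature(i, n):
    # Toy word-boundary feature s_i (begin/middle/end one-hot); the real
    # model derives it from a word segmentation tool over the training set.
    f = np.zeros(4)
    f[0 if i == 0 else (2 if i == n - 1 else 1)] = 1.0
    return f

def embed_sentence(sentence, matched_words):
    n = len(sentence)
    # c_i = [x_i ⊕ s_i]: character vector concatenated with its boundary feature
    chars = [np.concatenate([e_c[c], seg_feature(i, n)]) for i, c in enumerate(sentence)]
    # z_i: self-matching word vectors, zero-padded to the same width
    words = [np.concatenate([e_w[w], np.zeros(4)]) for w in matched_words]
    return np.stack(chars + words)  # Node_f = [c_1..c_n, z_1..z_m]

node = embed_sentence("网络中断", ["网络", "中断"])
```

The sentence "网络中断" (communication interruption) has four characters and two self-matching words, so `node` holds six row vectors.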
Third, attribute similarity calculation. After the entity name attributes are converted into character and word vectors, the similarity of two name attributes is calculated with cosine similarity; the closer the value is to 1, the closer the included angle is to 0, i.e. the more similar the two vectors are. The calculation formula is:

cos(Node_A, Node_B) = (Node_A · Node_B) / (‖Node_A‖ ‖Node_B‖)

wherein Node_A and Node_B denote the vectors of the two name attributes to be matched. Based on this formula, the attribute similarity between the entity to be updated and each entity name in the original graph can be obtained.
To avoid the influence of a large number of repeated professional terms on the similarity calculation, the similarity of the two name attributes is also calculated with the Jaccard similarity:

J(NAME_A, NAME_B) = |NAME_A ∩ NAME_B| / |NAME_A ∪ NAME_B|

wherein NAME_A and NAME_B denote the two name attributes to be matched, treated as sets of words; |NAME_A ∩ NAME_B| denotes the number of elements in the intersection of the two sets, and |NAME_A ∪ NAME_B| denotes the number of elements in their union.
The name attribute similarity is defined as the mean of the cosine similarity and the Jaccard similarity:

S_name = ( cos(Node_A, Node_B) + J(NAME_A, NAME_B) ) / 2
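The name attribute similarity just defined can be sketched directly; the function names are illustrative, and the vectors and word sets stand in for the embedding-layer outputs and segmented name attributes.

```python
import numpy as np

def cosine_sim(a, b):
    # cos(Node_A, Node_B) = (Node_A · Node_B) / (||Node_A|| ||Node_B||)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def jaccard_sim(words_a, words_b):
    # J = |intersection| / |union| over the two name attributes' word sets
    inter = len(set(words_a) & set(words_b))
    union = len(set(words_a) | set(words_b))
    return inter / union if union else 0.0

def name_similarity(vec_a, vec_b, words_a, words_b):
    # S_name: mean of cosine similarity and Jaccard similarity
    return (cosine_sim(vec_a, vec_b) + jaccard_sim(words_a, words_b)) / 2
```

The Jaccard term dampens inflated cosine scores between names that share many repeated professional terms.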
Fourth, improved GAT. The relation triple sequence of an entity is input, and the attention coefficient between the entity and each neighbor entity is calculated following the idea of GAT; the coefficients serve as weights for aggregating neighbor entity features. Learning entity embeddings with the relationship structure uses GAT:

e_i^(l+1) = σ( Σ_{j ∈ N_i ∪ {i}} α_ij^(l) W^(l) e_j^(l) )

wherein e_i^(l) and e_i^(l+1) denote the entity vectors input to and output from the l-th neighborhood attention layer; N_i denotes the set of entities connected to entity e_i; α_ij^(l) denotes the normalized neighborhood attention coefficient at layer l; u ∈ R^(2d(l+1)×1) and W^(l) ∈ R^(d(l+1)×d(l)) are learnable parameter matrices; d(l) denotes the network embedding dimension of the l-th layer. Inspired by this, the traditional GAT model is improved: an orthogonalization constraint is applied to the transformation matrix W and a W orthogonalization loss is learned, in order to preserve the relative distribution between entities during network embedding and transformation and to retain more of the true entity structure information. The W orthogonalization loss is calculated as:

L_0 = Σ_{l=1}^{M} ‖ (W^(l))^T W^(l) − I ‖_2^2

wherein W^(l) denotes the parameter matrix of the l-th layer; M is the number of attention network embedding layers; ‖·‖_2^2 denotes taking the 2-norm of a matrix and squaring it.
Fifth, alignment loss. As required, a loss function is trained on the pre-aligned entity seed set (an entity seed set is a set containing aligned entities of the knowledge graphs, and a positive example triple is a set containing an entity, its neighbor entities, and their relations). The loss function for training entity alignment over the attribute path is:

L_A = L_0 + L_a

L_a = Σ_{(e, e') ∈ S} [ d(e, e') + γ − d(e_, e'_) ]_+, with e_ ∈ NS(e) and e'_ ∈ NS(e')

wherein NS(e) denotes the negative example sampling set of entity e, constructed by the nearest neighbor sampling method; d(·,·) = 1 − cos(·,·) denotes the cosine distance between entities; [·]_+ = max{·, 0}; γ is a hyperparameter.
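The alignment term can be sketched as a hinge loss over seed pairs and their sampled negatives. This is one plausible reading of the margin loss defined above (the exact pairing of negatives is an assumption): aligned pairs are pulled within margin γ of being closer than their negative counterparts.

```python
import numpy as np

def cos_dist(a, b):
    # d(.,.) = 1 - cos(.,.), the cosine distance between entity vectors
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def alignment_loss(seeds, neg_pairs, gamma=1.0):
    """L_a: hinge loss pushing each pre-aligned seed pair (e, e') to be
    closer than its nearest-neighbour negative pair (e_, e'_) by gamma."""
    loss = 0.0
    for (e, e_pos), (e_neg, e_pos_neg) in zip(seeds, neg_pairs):
        loss += max(cos_dist(e, e_pos) + gamma - cos_dist(e_neg, e_pos_neg), 0.0)
    return loss
```

When a seed pair is already closer than its negatives by at least γ, that pair contributes nothing, so training focuses on the hard cases.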
Sixth, relationship structure modeling. The TransE model idea is adopted to calculate the entity relationship structure similarity. Given a relation triple (h, r, t), the model is trained on the entity vectors of the global structure embedding so that h + r ≈ t, which constrains the embedded representations of the head and tail entities while modeling the relationship structure.
The loss function of the relationship structure training part is:

L_R = Σ_{(h,r,t) ∈ T_1} Σ_{(h',r,t') ∈ T_1'} [ γ' + f(h, r, t) − f(h', r, t') ]_+

wherein f(h, r, t) = ‖h + r − t‖_2 denotes the scoring function of a triple, used to calculate the triple's confidence; h and t are the head and tail entity vectors from the global structure embedding layer, and r is the relation vector to be modeled and learned; γ' is a hyperparameter; T_1 denotes the positive example triple set, and T_1' = {(h', r, t) | h' ∈ E} ∪ {(h, r, t') | t' ∈ E} denotes the negative example triple set, constructed by randomly replacing head and tail entities within the same relation type. During training, positive triples are given a lower score and negative triples a higher score, and the two are separated by a maximum margin, so that the entities are fused with the relationship structure during embedding and the model's ability to distinguish entities is improved. The alignment loss and the relationship structure modeling loss are trained simultaneously on the pre-aligned entity seed set and the positive/negative example triples, the model parameters in the global structure embedding layer and the local semantic optimization layer are updated, and the influence of the topological structure and the relationship structure on entity embedding is learned simultaneously under the entity alignment task. The updated entity feature vectors yield the entity similarity matrix under the structural channel. The loss function for training entity alignment over the structural channel is:

L_s = L_A + L_R

By modeling the relationship structure similarity, the entity structure similarity can be calculated to obtain the structure similarity matrix S_relation.
Seventh, layer fusion. The attribute channel and the structural channel yield the name attribute similarity matrix S_name and the structural relationship similarity matrix S_relation respectively. The two matrices are first normalized to eliminate the influence of differing feature scales; the two matrices carry equal weight in the fusion, and the final entity similarity is their mean:

S = ( S_name' + S_relation' ) / 2

wherein S_name' and S_relation' denote the normalized similarity matrices.
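The fusion step can be sketched as follows; min-max scaling is one plausible reading of the normalization described above, chosen here because it brings both channels onto a common [0, 1] scale.

```python
import numpy as np

def fuse(S_name, S_rel):
    """Normalize each similarity matrix (min-max scaling, an assumption
    about the 'standardization' step) and average them with equal weight."""
    def norm(S):
        lo, hi = S.min(), S.max()
        return (S - lo) / (hi - lo) if hi > lo else np.zeros_like(S)
    return (norm(S_name) + norm(S_rel)) / 2
```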
the following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
Based on the method provided by the invention, considering both structural similarity and attribute name similarity, substituting the structural relationship <C_new1, cause, bug> of the target entity gives a similarity calculation result of 0.75, greatly improving the recognition of similar entities. Substituting the structural relationship <C_new2, cause, communication interruption> of C_new2 gives 0.51. The gap between the similarity result of the target entity and those of other unrelated cause entities is thus clearly widened.
Correspondingly, the invention also provides an entity updating system of the knowledge graph, which comprises the following steps:
the acquisition module is used for acquiring a new knowledge graph and an original knowledge graph;
the first calculation module is used for calculating a name attribute similarity matrix based on the name attributes of the new knowledge graph and the original knowledge graph;
the second calculation module is used for calculating an entity relationship similarity matrix based on the entity relationship structure triples of the new knowledge graph and the original knowledge graph;
and the updating module is used for fusing the name attribute similarity matrix and the entity relationship similarity matrix to obtain an entity corresponding to the new knowledge graph and updating the entity corresponding to the new knowledge graph into the original knowledge graph.
Further, the first calculation module is used for:
extracting, by looking up a character embedding matrix, the word vector s_Ai and the character vector x_Ai of each sentence S_Ai in the new knowledge graph;
extracting, by looking up a character embedding matrix, the word vector s_Bi and the character vector x_Bi of each sentence S_Bi in the original knowledge graph;
calculating the word-vector similarity of sentence S_Ai to sentence S_Bi from the word vectors s_Ai and s_Bi;
calculating the character-vector similarity of sentence S_Ai to sentence S_Bi from the character vectors x_Ai and x_Bi;
summing and averaging the word-vector similarity and the character-vector similarity of sentence S_Ai to sentence S_Bi to obtain the name attribute similarity S_namei of sentence S_Ai to sentence S_Bi;
and collecting the name attribute similarity of every sentence in the new knowledge graph to every sentence in the original knowledge graph to obtain the name attribute similarity matrix.
Further, the second calculation module is used for:
acquiring the entity relationship structure triple of each sentence in the new knowledge graph, inputting the entity relationship structure triple into an entity relationship structure similarity model trained in advance on the relationship structure triples of the original knowledge graph to obtain the entity structure similarity of each sentence, and assembling the entity structure similarity of each sentence into the entity structure similarity matrix.
Further, the second module is used for
Constructing an entity relationship structure similarity model to be trained, wherein the model is expressed as follows:
wherein the content of the first and second substances,entity vectors representing the input and output of the ith layer of domain attention, respectively;an entity vector representing the input of the l-th layer of attention of the domain, comprising an entity e i And all its neighbors; σ represents an activation function sigmoid; n is a radical of hydrogen i Representing an entity e i Set of connected entities of e j Representing an entity e i And all its neighbors, e k Representing an entity e i All neighbors of (2);representing the entity domain attention coefficient after the l level normalization;representing an entity e i The result is fused with the neighbor j;representing an entity e i The result is fused with the neighbor k through information; exp () represents an exponential function with a natural constant e as the base; leakyReLU () represents an activation function; u is formed by R 2d(l+1)×1 And W (l) ∈R d(l+1)×d(l) Is a learnable parameter matrix; d (l) represents the network embedding dimension of the l-th layer;d (l + 1) represents the network embedding dimension of the l +1 th layer; superscript T represents matrix transposition;
constructing a pre-aligned entity seed set and positive/negative case triples;
constructing a loss function L_A for entity alignment, used for training the entity relationship structure similarity model, expressed as:

L_A = L_0 + L_a

L_a = Σ_{(e, e') ∈ S} [ γ + d(e, e') − d(e_, e'_) ]_+

wherein L_a represents the entity alignment loss function of the entity relationship structure similarity model and S represents the pre-aligned entity seed set; L_0 represents the orthogonalization loss function of the parameter matrices W; the negative sample e_ of an entity e and the negative sample e'_ of its aligned entity e' are constructed by the nearest-neighbor sampling method NS(e); d(·,·) = 1 − cos(·,·) represents the cosine distance between entities; [·]_+ = max{·, 0}; and γ is a hyperparameter;

L_0 = Σ_{l=1}^{m} || (W^(l))^T W^(l) − I ||_2^2

wherein W^(l) represents the parameter matrix of the l-th layer; m is the number of attention network embedding layers; and ||·||_2^2 represents the squared 2-norm of a matrix;

constructing a loss function L_R of the relationship structure, used for training the entity relationship structure similarity model, expressed as:

L_R = Σ_{(h, r, t) ∈ T_1} Σ_{(h', r, t') ∈ T'_1(h,r,t)} [ γ' + f(h, r, t) − f(h', r, t') ]_+

wherein f(h, r, t) = ||h + r − t||_2 represents the scoring function of a relation triple (h, r, t), used for calculating the confidence of the triple; h and t are the head and tail entity vectors from the global structure embedding layer, and r is the relation vector to be modeled and learned; γ' is a hyperparameter; T_1 represents the positive triple set, and T'_1(h,r,t) = {(h', r, t) | h' ∈ E} ∪ {(h, r, t') | t' ∈ E} represents the negative triple set, wherein h' represents a corrupted head entity of the global structure embedding layer, t' represents a corrupted tail entity of the global structure embedding layer, and E represents the entity set;

training the entity alignment loss function L_A and the relationship structure loss function L_R respectively with the pre-aligned entity seed set and the positive/negative triples, determining the final model parameters of the entity relationship structure similarity model, and updating the model with the final parameters to obtain the trained entity relationship structure similarity model.
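In code, the three loss terms described above can be sketched as follows; the exact summation forms are reconstructed from the symbol definitions (margin losses with [·]_+ = max{·, 0}), so this is an assumption-laden sketch rather than the patented implementation:

```python
import numpy as np

def cos_dist(a, b):
    # d(x, y) = 1 - cos(x, y), the cosine distance between two entity vectors.
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def alignment_loss(seeds, negatives, emb_A, emb_B, gamma=1.0):
    """Margin-based alignment loss L_a over the pre-aligned seed set:
    [gamma + d(e, e') - d(e_, e'_)]_+ per seed pair and its negative sample."""
    loss = 0.0
    for (e, e2), (ne, ne2) in zip(seeds, negatives):
        loss += max(gamma + cos_dist(emb_A[e], emb_B[e2])
                          - cos_dist(emb_A[ne], emb_B[ne2]), 0.0)
    return loss

def relation_loss(pos, neg, ent, rel, gamma2=1.0):
    """TransE-style relation loss L_R with scoring f(h, r, t) = ||h + r - t||_2."""
    f = lambda h, r, t: np.linalg.norm(ent[h] + rel[r] - ent[t])
    return sum(max(gamma2 + f(*p) - f(*n), 0.0) for p, n in zip(pos, neg))

def orthogonal_loss(Ws):
    """L_0: keep each layer's parameter matrix near-orthogonal, ||W^T W - I||_2^2."""
    return sum(np.linalg.norm(W.T @ W - np.eye(W.shape[1])) ** 2 for W in Ws)
```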
Further, the update module is used for
Normalizing the name attribute similarity matrix and the entity relationship similarity matrix, respectively;
And averaging the normalized name attribute similarity matrix and the normalized entity relationship similarity matrix to obtain the fused final entity similarity matrix.
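The fusion step can be sketched as follows; min-max normalization is an assumption (the description says only that the matrices are normalized), and the average is taken elementwise:

```python
import numpy as np

def fuse_similarity(S_name, S_struct):
    """Min-max normalize each similarity matrix, then average them elementwise
    to obtain the fused final entity similarity matrix."""
    def norm(S):
        lo, hi = S.min(), S.max()
        return (S - lo) / (hi - lo) if hi > lo else np.zeros_like(S)
    return (norm(S_name) + norm(S_struct)) / 2.0
```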
The present invention accordingly also provides a computer readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a computing device, cause the computing device to perform any of the methods described above.
The invention also provides a computing device, comprising,
one or more processors, memory, and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing any of the methods.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, it is possible to make various improvements and modifications without departing from the technical principle of the present invention, and those improvements and modifications should be considered as the protection scope of the present invention.
Claims (12)
1. A method for entity update of a knowledge graph, comprising:
acquiring a new knowledge graph and an original knowledge graph;
based on the name attributes of the new knowledge graph and the original knowledge graph, calculating a name attribute similarity matrix;
calculating an entity relationship similarity matrix based on the entity relationship structure triples of the new knowledge graph and the original knowledge graph;
and fusing the name attribute similarity matrix and the entity relationship similarity matrix to obtain an entity corresponding to the new knowledge graph, and updating the entity corresponding to the new knowledge graph into the original knowledge graph.
2. The method for entity update of a knowledge graph of claim 1, wherein the calculating of the name attribute similarity matrix based on the name attributes of the new knowledge graph and the original knowledge graph comprises:
extracting the word vector s_Ai and the character vector x_Ai of each sentence S_Ai in the new knowledge graph by looking up a character embedding matrix;
extracting the word vector s_Bi and the character vector x_Bi of each sentence S_Bi in the original knowledge graph by looking up a character embedding matrix;
calculating the word vector similarity of the sentence S_Ai with respect to the sentence S_Bi according to the word vectors s_Ai and s_Bi;
calculating the character vector similarity of the sentence S_Ai with respect to the sentence S_Bi according to the character vectors x_Ai and x_Bi;
summing and averaging the word vector similarity and the character vector similarity of the sentence S_Ai with respect to the sentence S_Bi to obtain the name attribute similarity S_name,i of the sentence S_Ai with respect to the sentence S_Bi;
and obtaining the name attribute similarity of each sentence in the new knowledge graph with respect to each sentence in the original knowledge graph to obtain the name attribute similarity matrix.
3. The method for entity updating of a knowledge graph according to claim 1, wherein the calculating of the entity relationship similarity matrix based on the entity relationship structure triples of the new knowledge graph and the original knowledge graph comprises:
acquiring the entity relationship structure triple of each sentence in the new knowledge graph, inputting the triple into an entity relationship structure similarity model trained in advance on the relationship structure triples of the original knowledge graph to obtain the entity structure similarity of each sentence, and obtaining the entity structure similarity matrix from the entity structure similarities of all the sentences.
4. The method for entity update of a knowledge graph of claim 3, wherein the training process of the entity relationship structure similarity model trained on the relationship structure triples of the original knowledge graph comprises:
constructing an entity relationship structure similarity model to be trained, the neighborhood-attention layer of which is expressed as:

e_i^(l+1) = σ( Σ_{j ∈ N_i ∪ {i}} α_ij^(l) W^(l) e_j^(l) )

α_ij^(l) = exp( LeakyReLU( u^T [ W^(l) e_i^(l) || W^(l) e_j^(l) ] ) ) / Σ_{k ∈ N_i ∪ {i}} exp( LeakyReLU( u^T [ W^(l) e_i^(l) || W^(l) e_k^(l) ] ) )

wherein e_i^(l) and e_i^(l+1) represent the entity vectors input to and output from the l-th neighborhood-attention layer, respectively; the input of the l-th layer comprises the entity e_i and all its neighbors; σ represents the sigmoid activation function; N_i represents the set of entities connected to the entity e_i, and e_j and e_k range over the entity e_i and all its neighbors; α_ij^(l) represents the entity neighborhood attention coefficient after normalization at the l-th layer; u^T[W^(l) e_i^(l) || W^(l) e_j^(l)] represents the result of fusing the information of the entity e_i with its neighbor j, and likewise with its neighbor k; exp(·) represents the exponential function with the natural constant e as base; LeakyReLU(·) represents the LeakyReLU activation function; u ∈ R^{2d(l+1)×1} and W^(l) ∈ R^{d(l+1)×d(l)} are learnable parameter matrices; d(l) and d(l+1) represent the network embedding dimensions of the l-th and (l+1)-th layers, respectively; and the superscript T represents matrix transposition;
constructing a pre-aligned entity seed set and positive/negative case triples;
constructing a loss function L_A for entity alignment, used for training the entity relationship structure similarity model, expressed as:

L_A = L_0 + L_a

L_a = Σ_{(e, e') ∈ S} [ γ + d(e, e') − d(e_, e'_) ]_+

wherein L_a represents the entity alignment loss function of the entity relationship structure similarity model and S represents the pre-aligned entity seed set; L_0 represents the orthogonalization loss function of the parameter matrices W; the negative sample e_ of an entity e and the negative sample e'_ of its aligned entity e' are constructed by the nearest-neighbor sampling method NS(e); d(·,·) = 1 − cos(·,·) represents the cosine distance between entities; [·]_+ = max{·, 0}; and γ is a hyperparameter;

L_0 = Σ_{l=1}^{m} || (W^(l))^T W^(l) − I ||_2^2

wherein W^(l) represents the parameter matrix of the l-th layer; m is the number of attention network embedding layers; and ||·||_2^2 represents the squared 2-norm of a matrix;

constructing a loss function L_R of the relationship structure, used for training the entity relationship structure similarity model, expressed as:

L_R = Σ_{(h, r, t) ∈ T_1} Σ_{(h', r, t') ∈ T'_1(h,r,t)} [ γ' + f(h, r, t) − f(h', r, t') ]_+

wherein f(h, r, t) = ||h + r − t||_2 represents the scoring function of a relation triple (h, r, t), used for calculating the confidence of the triple; h and t are the head and tail entity vectors from the global structure embedding layer, and r is the relation vector to be modeled and learned; γ' is a hyperparameter; T_1 represents the positive triple set, and T'_1(h,r,t) = {(h', r, t) | h' ∈ E} ∪ {(h, r, t') | t' ∈ E} represents the negative triple set, wherein h' represents a corrupted head entity of the global structure embedding layer, t' represents a corrupted tail entity of the global structure embedding layer, and E represents the entity set;

training the entity alignment loss function L_A and the relationship structure loss function L_R respectively with the pre-aligned entity seed set and the positive/negative triples, determining the final model parameters of the entity relationship structure similarity model, and updating the model with the final parameters to obtain the trained entity relationship structure similarity model.
5. The method for updating an entity of a knowledge graph of claim 4, wherein the fusing the name attribute similarity matrix and the entity relationship similarity matrix comprises:
normalizing the name attribute similarity matrix and the entity relationship similarity matrix, respectively;
and averaging the normalized name attribute similarity matrix and the normalized entity relationship similarity matrix to obtain the fused final entity similarity matrix.
6. An entity update system for a knowledge graph, comprising:
the acquisition module is used for acquiring a new knowledge graph and an original knowledge graph;
the first calculation module is used for calculating a name attribute similarity matrix based on the name attributes of the new knowledge graph and the original knowledge graph;
the second calculation module is used for calculating an entity relationship similarity matrix based on the entity relationship structure triples of the new knowledge graph and the original knowledge graph;
and the updating module is used for fusing the name attribute similarity matrix and the entity relationship similarity matrix to obtain an entity corresponding to the new knowledge graph and updating the entity corresponding to the new knowledge graph into the original knowledge graph.
7. The entity update system of a knowledge graph of claim 6, wherein said first calculation module is configured to
Extract the word vector s_Ai and the character vector x_Ai of each sentence S_Ai in the new knowledge graph by looking up a character embedding matrix;
Extract the word vector s_Bi and the character vector x_Bi of each sentence S_Bi in the original knowledge graph by looking up a character embedding matrix;
Calculate the word vector similarity of the sentence S_Ai with respect to the sentence S_Bi according to the word vectors s_Ai and s_Bi;
Calculate the character vector similarity of the sentence S_Ai with respect to the sentence S_Bi according to the character vectors x_Ai and x_Bi;
Sum and average the word vector similarity and the character vector similarity of the sentence S_Ai with respect to the sentence S_Bi to obtain the name attribute similarity S_name,i of the sentence S_Ai with respect to the sentence S_Bi;
And obtain the name attribute similarity of each sentence in the new knowledge graph with respect to each sentence in the original knowledge graph to obtain the name attribute similarity matrix.
8. The entity update system of a knowledge graph of claim 6, wherein said second calculation module is configured to
Acquire the entity relationship structure triple of each sentence in the new knowledge graph, input the triple into an entity relationship structure similarity model trained in advance on the relationship structure triples of the original knowledge graph to obtain the entity structure similarity of each sentence, and obtain the entity structure similarity matrix from the entity structure similarities of all the sentences.
9. The entity update system of a knowledge graph of claim 8, wherein said second calculation module is configured to
Construct an entity relationship structure similarity model to be trained, the neighborhood-attention layer of which is expressed as:

e_i^(l+1) = σ( Σ_{j ∈ N_i ∪ {i}} α_ij^(l) W^(l) e_j^(l) )

α_ij^(l) = exp( LeakyReLU( u^T [ W^(l) e_i^(l) || W^(l) e_j^(l) ] ) ) / Σ_{k ∈ N_i ∪ {i}} exp( LeakyReLU( u^T [ W^(l) e_i^(l) || W^(l) e_k^(l) ] ) )

wherein e_i^(l) and e_i^(l+1) represent the entity vectors input to and output from the l-th neighborhood-attention layer, respectively; the input of the l-th layer comprises the entity e_i and all its neighbors; σ represents the sigmoid activation function; N_i represents the set of entities connected to the entity e_i, and e_j and e_k range over the entity e_i and all its neighbors; α_ij^(l) represents the entity neighborhood attention coefficient after normalization at the l-th layer; u^T[W^(l) e_i^(l) || W^(l) e_j^(l)] represents the result of fusing the information of the entity e_i with its neighbor j, and likewise with its neighbor k; exp(·) represents the exponential function with the natural constant e as base; LeakyReLU(·) represents the LeakyReLU activation function; u ∈ R^{2d(l+1)×1} and W^(l) ∈ R^{d(l+1)×d(l)} are learnable parameter matrices; d(l) and d(l+1) represent the network embedding dimensions of the l-th and (l+1)-th layers, respectively; and the superscript T represents matrix transposition;
constructing a pre-aligned entity seed set and positive/negative case triples;
construct a loss function L_A for entity alignment, used for training the entity relationship structure similarity model, expressed as:

L_A = L_0 + L_a

L_a = Σ_{(e, e') ∈ S} [ γ + d(e, e') − d(e_, e'_) ]_+

wherein L_a represents the entity alignment loss function of the entity relationship structure similarity model and S represents the pre-aligned entity seed set; L_0 represents the orthogonalization loss function of the parameter matrices W; the negative sample e_ of an entity e and the negative sample e'_ of its aligned entity e' are constructed by the nearest-neighbor sampling method NS(e); d(·,·) = 1 − cos(·,·) represents the cosine distance between entities; [·]_+ = max{·, 0}; and γ is a hyperparameter;

L_0 = Σ_{l=1}^{m} || (W^(l))^T W^(l) − I ||_2^2

wherein W^(l) represents the parameter matrix of the l-th layer; m is the number of attention network embedding layers; and ||·||_2^2 represents the squared 2-norm of a matrix;

construct a loss function L_R of the relationship structure, used for training the entity relationship structure similarity model, expressed as:

L_R = Σ_{(h, r, t) ∈ T_1} Σ_{(h', r, t') ∈ T'_1(h,r,t)} [ γ' + f(h, r, t) − f(h', r, t') ]_+

wherein f(h, r, t) = ||h + r − t||_2 represents the scoring function of a relation triple (h, r, t), used for calculating the confidence of the triple; h and t are the head and tail entity vectors from the global structure embedding layer, and r is the relation vector to be modeled and learned; γ' is a hyperparameter; T_1 represents the positive triple set, and T'_1(h,r,t) = {(h', r, t) | h' ∈ E} ∪ {(h, r, t') | t' ∈ E} represents the negative triple set, wherein h' represents a corrupted head entity of the global structure embedding layer, t' represents a corrupted tail entity of the global structure embedding layer, and E represents the entity set;

train the entity alignment loss function L_A and the relationship structure loss function L_R respectively with the pre-aligned entity seed set and the positive/negative triples, determine the final model parameters of the entity relationship structure similarity model, and update the model with the final parameters to obtain the trained entity relationship structure similarity model.
10. The entity update system of a knowledge graph of claim 9, wherein the update module is configured to
Normalize the name attribute similarity matrix and the entity relationship similarity matrix, respectively;
And average the normalized name attribute similarity matrix and the normalized entity relationship similarity matrix to obtain the fused final entity similarity matrix.
11. A computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods of claims 1-5.
12. A computing device, comprising,
one or more processors, memory, and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing any of the methods of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211047396.6A CN115809340A (en) | 2022-08-29 | 2022-08-29 | Entity updating method and system of knowledge graph |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115809340A true CN115809340A (en) | 2023-03-17 |
Family
ID=85482426
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211047396.6A Pending CN115809340A (en) | 2022-08-29 | 2022-08-29 | Entity updating method and system of knowledge graph |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115809340A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116150405A (en) * | 2023-04-19 | 2023-05-23 | 中电科大数据研究院有限公司 | Heterogeneous data processing method for multiple scenes |
CN116150405B (en) * | 2023-04-19 | 2023-06-27 | 中电科大数据研究院有限公司 | Heterogeneous data processing method for multiple scenes |
CN116226541A (en) * | 2023-05-11 | 2023-06-06 | 湖南工商大学 | Knowledge graph-based network hotspot information recommendation method, system and equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110825881B (en) | Method for establishing electric power knowledge graph | |
CN115809340A (en) | Entity updating method and system of knowledge graph | |
CN107967255A (en) | A kind of method and system for judging text similarity | |
CN109857846B (en) | Method and device for matching user question and knowledge point | |
Lin et al. | Deep structured scene parsing by learning with image descriptions | |
CN112800774B (en) | Entity relation extraction method, device, medium and equipment based on attention mechanism | |
CN112711953A (en) | Text multi-label classification method and system based on attention mechanism and GCN | |
CN106997373A (en) | A kind of link prediction method based on depth confidence network | |
CN106778878A (en) | A kind of character relation sorting technique and device | |
CN113157919A (en) | Sentence text aspect level emotion classification method and system | |
CN115328782A (en) | Semi-supervised software defect prediction method based on graph representation learning and knowledge distillation | |
CN114841151A (en) | Medical text entity relation joint extraction method based on decomposition-recombination strategy | |
CN113033186B (en) | Error correction early warning method and system based on event analysis | |
CN113420117B (en) | Sudden event classification method based on multivariate feature fusion | |
CN114840685A (en) | Emergency plan knowledge graph construction method | |
WO2020240572A1 (en) | Method for training a discriminator | |
CN113779190A (en) | Event cause and effect relationship identification method and device, electronic equipment and storage medium | |
CN112905750A (en) | Generation method and device of optimization model | |
CN117494760A (en) | Semantic tag-rich data augmentation method based on ultra-large-scale language model | |
CN115358477B (en) | Fight design random generation system and application thereof | |
CN116680407A (en) | Knowledge graph construction method and device | |
CN115600595A (en) | Entity relationship extraction method, system, equipment and readable storage medium | |
Li et al. | Evaluating BERT on cloud-edge time series forecasting and sentiment analysis via prompt learning | |
CN113392220A (en) | Knowledge graph generation method and device, computer equipment and storage medium | |
CN113094504A (en) | Self-adaptive text classification method and device based on automatic machine learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||