CN112818134A

CN112818134A - Knowledge graph completion method based on rules and embedding

Info

Publication number: CN112818134A
Application number: CN202110197370.9A
Authority: CN
Inventors: 李晋; 朴贤姬; 项金鹏; 程建华; 白玉; 王春波
Original assignee: Harbin Engineering University
Current assignee: Harbin Engineering University
Priority date: 2021-02-22
Filing date: 2021-02-22
Publication date: 2021-05-18

Abstract

A knowledge graph completion method based on rules and embedding belongs to the technical field of campus bullying behavior prediction. The invention solves the problem that existing research methods cannot achieve both accuracy and usability at the same time. The method first defines the relation triples in the knowledge graph, performs a first completion of the knowledge graph based on four relation rules, and then performs a second completion on the once-completed knowledge graph using an embedding-based knowledge graph completion method. The method thereby provides knowledge graph completion with high accuracy and good usability. The method can be applied to campus bullying behavior prediction.

Description

Knowledge graph completion method based on rules and embedding

Technical Field

The invention belongs to the technical field of campus bullying behavior prediction, and particularly relates to a knowledge graph completion method based on rules and embedding.

Background

With the rapid development of society, internet technology has also advanced rapidly, and more and more individuals and organizations use the internet to store and process hundreds of millions of interrelated data items. This has led to explosive growth of data on the internet, and how to represent such large-scale relational data more completely has become a research hotspot at home and abroad. The knowledge graph is an intuitive, efficient and accurate form of knowledge representation suited to large volumes of multi-relational data, and it allows large-scale relational data on the internet to be represented visually. Although knowledge graphs bring many benefits, relationships missing from a knowledge graph cause many problems in practical applications. Knowledge graph completion performs knowledge reasoning on an existing knowledge graph: implicit relationship data are inferred from the knowledge already in the graph and then added to it. Completion helps people mine more useful information, plays an indispensable role in subsequent data analysis, and is an important direction of current research at home and abroad.

Existing research on knowledge graph completion falls mainly into rule-based methods and embedding-based methods. Rule-based methods have higher accuracy and stronger interpretability, but poorer generalization ability and higher complexity. Embedding-based methods have better usability and scalability, but lower accuracy and weaker interpretability. Therefore, existing research methods cannot achieve high accuracy and good usability at the same time.

Disclosure of Invention

The invention aims to solve the problem that existing research methods cannot achieve both accuracy and usability at the same time, and provides a knowledge graph completion method based on rules and embedding.

The technical scheme adopted by the invention to solve the technical problem is as follows: a knowledge graph completion method based on rules and embedding, which specifically comprises the following steps:

step S1, defining relation triples in the knowledge graph, wherein the relation triples comprise entity-entity triples and entity-attribute triples;

step S2, completing the entity-entity triples by using the transfer rule, the antisymmetric rule and the entity association rule, and completing the entity-attribute triples by using the attribute association rule, to obtain the knowledge graph after the first completion;

step S3, converting the entity, attribute, entity relationship and attribute relationship in the knowledge graph after the first completion into an entity embedding vector, an attribute embedding vector, an entity relationship embedding vector and an attribute relationship embedding vector respectively;

taking the entity embedding vector, the attribute embedding vector, the entity relationship embedding vector and the attribute relationship embedding vector as the input of a TransE model, defining a score function on an entity-entity triple, a loss function on the entity-entity triple, a score function on the entity-attribute triple and a loss function on the entity-attribute triple, and training the entity-entity triple and the entity-attribute triple by using a TransE model training algorithm to obtain the trained entity embedding vector, attribute embedding vector, entity relationship embedding vector and attribute relationship embedding vector;

performing a second completion of the knowledge graph by using an entity prediction algorithm, an attribute prediction algorithm and a relation prediction algorithm, thereby completing the knowledge graph;

further, the specific process of step S1 is as follows:

s11, all entities in the knowledge graph form an entity set, all attributes form an attribute set, all entity relations form an entity relation set, and all attribute relations form an attribute relation set;

s12, defining an entity-entity triple based on the entity in the entity set and the entity relationship in the entity relationship set;

s13, defining an entity-attribute triple based on the entity in the entity set, the attribute in the attribute set and the attribute relationship in the attribute relationship set;

further, the completing the entity-entity triplet by using the transfer rule includes:

s211, the transfer rule is defined as

$$(e_i, re_l, e_j)_e \wedge (e_j, re_m, e_k)_e \Rightarrow (e_i, re_n, e_k)_e$$

wherein e_i, e_j, e_k represent entities, e_i, e_j, e_k ∈ E, E represents the entity set, re_l, re_m, re_n represent entity relationships, re_l, re_m, re_n ∈ Re, Re represents the entity relationship set, and (·)_e represents an entity-entity triple; all specific transfer rules that conform to the form of the transfer rule are extracted from the knowledge graph;

s212, extracting the key information of each specific transfer rule to form a transfer rule candidate tc, tc = (re_l, re_m, re_n)_c; all transfer rule candidates constitute the transfer rule candidate set Stc;

s213, filtering each transfer rule candidate tc by using the transfer rule candidate filtering algorithm to obtain a transfer rule correct candidate trc, trc = (re_l, re_m, re_n)_r; all transfer rule correct candidates constitute the transfer rule correct candidate set Strc;

s214, obtaining a transfer rule completion entity-entity triple tet by using two known entity-entity triples in the knowledge graph and the corresponding transfer rule correct candidate trc; all transfer rule completion entity-entity triples constitute the transfer rule completion entity-entity triple set Stet, Stet = {tet₁, tet₂, ..., tet_|Stet|}, where tet₁ represents the 1st transfer rule completion entity-entity triple in the set Stet and |Stet| is the number of transfer rule completion entity-entity triples contained in the set Stet;

s215, adding all entity-entity triples in the transfer rule completion entity-entity triplet set Stet into the knowledge graph to complete the knowledge graph completion based on the transfer rule;

further, the completion of the entity-entity triplet by using the antisymmetric rule includes the following specific processes:

s221, the antisymmetric rule is defined as

$$(e_i, re_l, e_j)_e \Rightarrow (e_j, re_m, e_i)_e$$

wherein e_i, e_j represent entities, e_i, e_j ∈ E, E represents the entity set, re_l, re_m represent entity relationships, re_l, re_m ∈ Re, and Re represents the entity relationship set; all specific antisymmetric rules that conform to the form of the antisymmetric rule are extracted from the knowledge graph;

s222, extracting the key information of each specific antisymmetric rule to form an antisymmetric rule candidate ac, ac = (re_l, re_m)_c; all antisymmetric rule candidates constitute the antisymmetric rule candidate set Sac;

s223, filtering each antisymmetric rule candidate ac by using the antisymmetric rule candidate filtering algorithm to obtain an antisymmetric rule correct candidate arc, arc = (re_l, re_m)_r; all antisymmetric rule correct candidates constitute the antisymmetric rule correct candidate set Sarc;

s224, obtaining an antisymmetric rule completion entity-entity triple tea by using a known entity-entity triple in the knowledge graph and the corresponding antisymmetric rule correct candidate arc; all antisymmetric rule completion entity-entity triples constitute the antisymmetric rule completion entity-entity triple set Stea, Stea = {tea₁, tea₂, ..., tea_|Stea|}, where tea₁ represents the 1st antisymmetric rule completion entity-entity triple in the set Stea and |Stea| is the number of antisymmetric rule completion entity-entity triples contained in the set Stea;

s225, adding all entity-entity triples in the entity-entity triplet set Stea to the knowledge graph to complete the knowledge graph completion based on the antisymmetric rules;
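Steps s221 to s225 can be illustrated with a minimal Python sketch. It assumes the antisymmetric rule has the form "(e_i, re_l, e_j) implies (e_j, re_m, e_i)" and that correct candidates are stored as pairs (re_l, re_m). The function name and the relation names "teaches" and "taught by" are hypothetical and do not come from the source.

```python
def complete_with_antisymmetric_rules(triples, rules):
    """Infer reversed triples from antisymmetric rule candidates.

    triples: set of (head, relation, tail) entity-entity triples.
    rules:   set of (re_l, re_m) pairs, assumed to be already filtered.
    """
    reverse_of = {}
    for re_l, re_m in rules:
        reverse_of.setdefault(re_l, set()).add(re_m)

    new_triples = set()
    for e_i, re_l, e_j in triples:
        for re_m in reverse_of.get(re_l, ()):
            candidate = (e_j, re_m, e_i)
            if candidate not in triples:  # keep only genuinely new facts
                new_triples.add(candidate)
    return new_triples


# Hypothetical example data (relation names are illustrative only).
triples = {("Teacher Wang", "teaches", "Zhang San")}
rules = {("teaches", "taught by")}
stea = complete_with_antisymmetric_rules(triples, rules)
```

The returned set plays the role of Stea; adding it to the original triple set corresponds to step s225.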

further, the completing the entity-entity triple by using the entity association rule includes the following specific processes:

s231, the entity association rule is defined as

$$(e_i, re_l, e_j)_e \Rightarrow (e_i, re_m, e_k)_e$$

wherein e_i, e_j, e_k ∈ E and e_j ≠ e_k, e_i, e_j, e_k represent entities, E represents the entity set, re_l, re_m ∈ Re, re_l, re_m represent entity relationships, and Re represents the entity relationship set; all specific entity association rules that conform to the form of the entity association rule are extracted from the knowledge graph;

s232, extracting the key information of each specific entity association rule to form an entity association rule candidate rc, rc = (re_l, e_j, re_m, e_k)_cr; all entity association rule candidates constitute the entity association rule candidate set Src;

s233, filtering each entity association rule candidate rc by using the entity association rule candidate filtering algorithm to obtain an entity association rule correct candidate rrc, rrc = (re_l, e_j, re_m, e_k)_rr; all entity association rule correct candidates constitute the entity association rule correct candidate set Srrc;

s234, obtaining an entity association rule completion entity-entity triple ter by using a known entity-entity triple in the knowledge graph and the corresponding entity association rule correct candidate rrc; all entity association rule completion entity-entity triples constitute the entity association rule completion entity-entity triple set Ster, Ster = {ter₁, ter₂, ..., ter_|Ster|}, where ter₁ represents the 1st entity association rule completion entity-entity triple in the set Ster and |Ster| is the number of entity association rule completion entity-entity triples contained in the set Ster;

s235, adding all entity-entity triples in the entity-entity triplet set Ster to the knowledge graph to complete the knowledge graph completion based on the entity association rule;
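The entity association steps s231 to s235 can be sketched in the same way. The sketch assumes the rule form "(e_i, re_l, e_j) implies (e_i, re_m, e_k)" with e_j ≠ e_k, and stores each correct candidate as a 4-tuple (re_l, e_j, re_m, e_k). The function name and all example entities and relations are hypothetical.

```python
def complete_with_entity_association(triples, rules):
    """Infer new triples that share the head entity of a known triple.

    triples: set of (head, relation, tail) entity-entity triples.
    rules:   set of (re_l, e_j, re_m, e_k) tuples, assumed already filtered.
    """
    new_triples = set()
    for e_i, relation, tail in triples:
        for re_l, e_j, re_m, e_k in rules:
            # The rule fires only when both relation and tail entity match.
            if (relation, tail) == (re_l, e_j):
                candidate = (e_i, re_m, e_k)
                if candidate not in triples:
                    new_triples.add(candidate)
    return new_triples


# Hypothetical example data (names are illustrative only).
triples = {
    ("Zhang San", "member of", "Class 1"),
    ("Li Si", "member of", "Class 2"),
}
rules = {("member of", "Class 1", "taught by", "Teacher Wang")}
ster = complete_with_entity_association(triples, rules)
```

The attribute association rule of the following steps has the same shape, with attributes and attribute relationships in place of tail entities and entity relationships.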

further, the completion of the entity-attribute triple by using the attribute association rule includes the following specific processes:

s241, the attribute association rule is defined as

$$(e_i, ra_l, a_j)_a \Rightarrow (e_i, ra_m, a_k)_a$$

wherein e_i ∈ E, e_i represents an entity, E represents the entity set, a_j, a_k ∈ A and a_j ≠ a_k, a_j, a_k represent attributes, A represents the attribute set, ra_l, ra_m ∈ Ra, ra_l, ra_m represent attribute relationships, Ra represents the attribute relationship set, and (·)_a represents an entity-attribute triple; all specific attribute association rules that conform to the form of the attribute association rule are extracted from the knowledge graph;

s242, extracting the key information of each specific attribute association rule to form an attribute association rule candidate atrc, atrc = (ra_l, a_j, ra_m, a_k)_car; all attribute association rule candidates constitute the attribute association rule candidate set Satrc;

s243, filtering each attribute association rule candidate atrc by using the attribute association rule candidate filtering algorithm to obtain an attribute association rule correct candidate arrc, arrc = (ra_l, a_j, ra_m, a_k)_aar; all attribute association rule correct candidates constitute the attribute association rule correct candidate set Sarrc;

s244, obtaining an attribute association rule completion entity-attribute triple tar by using a known entity-attribute triple in the knowledge graph and the corresponding attribute association rule correct candidate arrc; all attribute association rule completion entity-attribute triples constitute the attribute association rule completion entity-attribute triple set Star, Star = {tar₁, tar₂, ..., tar_|Star|}, where tar₁ represents the 1st attribute association rule completion entity-attribute triple in the set Star and |Star| is the number of attribute association rule completion entity-attribute triples contained in the set Star;

s245, adding all entity-attribute triples in the attribute association rule completion entity-attribute triplet set Star into the knowledge graph to complete the completion of the knowledge graph based on the attribute association rule;

further, in step S3, the entity, the attribute, the entity relationship, and the attribute relationship in the knowledge graph after the first completion are respectively converted into an entity embedding vector, an attribute embedding vector, an entity relationship embedding vector, and an attribute relationship embedding vector, and the specific process is as follows:

randomly initializing the entity into a vector with a dimension of k by utilizing uniform distribution, and then carrying out standardization processing on the vector obtained by random initialization to obtain an entity embedded vector;

obtaining an attribute embedded vector, an entity relationship embedded vector and an attribute relationship embedded vector in the same way;
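The initialization described above can be sketched as follows. The source only states that a uniform distribution is used and that the vector is then normalized; the bound 6/sqrt(k) and the choice of L2 normalization are assumptions borrowed from common TransE practice, and the function name is hypothetical.

```python
import math
import random


def init_embedding(k, rng):
    """Return a k-dimensional vector drawn uniformly and scaled to unit L2 norm."""
    bound = 6.0 / math.sqrt(k)  # assumed bound, not specified in the source
    vector = [rng.uniform(-bound, bound) for _ in range(k)]
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
entity_vectors = {name: init_embedding(16, rng)
                  for name in ["Zhang San", "Li Si", "Wang Wu"]}
```

Attribute, entity relationship and attribute relationship vectors would be produced by calling the same function over the corresponding sets.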

further, the score function on an entity-entity triple is defined as:

$$f_e(e_i, re_l, e_j) = \lVert \mathbf{e}_i + \mathbf{re}_l - \mathbf{e}_j \rVert_{L_1/L_2}$$

wherein f_e(e_i, re_l, e_j) represents the score function on the entity-entity triple (e_i, re_l, e_j)_e, $\mathbf{e}_i$ and $\mathbf{e}_j$ respectively represent the entity embedding vectors corresponding to entities e_i and e_j, $\mathbf{re}_l$ represents the entity relationship embedding vector corresponding to entity relationship re_l, and $\lVert \cdot \rVert_{L_1/L_2}$ represents the 1-norm or the 2-norm;

the loss function on entity-entity triples is defined as:

$$L_e = \sum_{(e_i, re_l, e_j)_e \in Te} \; \sum_{(e_i', re_l, e_j')_e \in Te'} \left[ \gamma + f_e(e_i, re_l, e_j) - f_e(e_i', re_l, e_j') \right]_+$$

wherein L_e represents the loss function on entity-entity triples, (e_i, re_l, e_j)_e ∈ Te, Te represents the set of entity-entity triples, (e_i′, re_l, e_j′)_e represents the replaced entity-entity triple obtained by randomly replacing e_i or e_j in (e_i, re_l, e_j)_e, (e_i′, re_l, e_j′)_e ∈ Te′, Te′ represents the set of replaced entity-entity triples, $\mathbf{e}_i'$ represents the entity embedding vector corresponding to e_i′, $\mathbf{e}_j'$ represents the entity embedding vector corresponding to e_j′, γ is the margin hyper-parameter, and [X]₊ equals X if X is greater than or equal to 0 and equals 0 otherwise;

the score function and the loss function on entity-attribute triples are defined in the same way;
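As an illustration of the score and loss described in this section, the following minimal Python sketch assumes the usual TransE form, in which the score is the 1-norm or 2-norm of head + relation - tail and the loss is a margin-based hinge on a positive and a negative triple. The function names and the sample vectors are hypothetical.

```python
import math


def score(head, relation, tail, norm=1):
    """Distance-style score: lower means the triple is more plausible."""
    diff = [h + r - t for h, r, t in zip(head, relation, tail)]
    if norm == 1:
        return sum(abs(d) for d in diff)
    return math.sqrt(sum(d * d for d in diff))


def margin_loss(pos_score, neg_score, gamma):
    """Hinge term [gamma + f(positive) - f(negative)]_+ for one pair."""
    return max(0.0, gamma + pos_score - neg_score)


# Hypothetical 2-dimensional embeddings, for illustration only.
e_i, re_l, e_j = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
e_j_replaced = [3.0, 3.0]
loss = margin_loss(score(e_i, re_l, e_j), score(e_i, re_l, e_j_replaced), gamma=1.0)
```

Summing `margin_loss` over all sampled positive and replaced triples gives the total loss; the same two functions apply unchanged to entity-attribute triples.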

further, the training of the entity-entity triplet by using the TransE model training algorithm specifically includes:

randomly sampling b entity-entity triples from the entity-entity triple set of the knowledge graph after the first completion, then carrying out negative sampling on the sampled entity-entity triples, namely randomly replacing head entities in the entity-entity triples with other entities or tail entities with other entities, forming negative sampling entity-entity triples by the replaced entity-entity triples, and then adding the corresponding entity-entity triples and the negative sampling entity-entity triple combinations corresponding to the entity-entity triples into the set T;

for example, if 3 triples A, B and C are sampled and the negative sampling entity-entity triples corresponding to A are A1, A2, A3, …, AM, then the combinations of A with A1, A2, A3, …, AM are added to the set T and the loss function of each of these combinations is calculated; if the negative sampling entity-entity triples corresponding to B are B1, B2, B3, …, BM, then the combinations of B with B1, B2, B3, …, BM are added to the set T and the loss function of each of these combinations is calculated; if the negative sampling entity-entity triples corresponding to C are C1, C2, C3, …, CM, then the combinations of C with C1, C2, C3, …, CM are added to the set T and the loss function of each of these combinations is calculated;

calculating loss function values of each combination in the set T in sequence, and updating parameters of the TransE model by adopting a gradient descent method after calculating the loss function value of each combination;

the entity-attribute triples are trained in the same way;

after the parameters of the TransE model are updated, obtaining a trained TransE model, and outputting an entity embedded vector, an attribute embedded vector, an entity relationship embedded vector and an attribute relationship embedded vector which are trained by utilizing the trained TransE model;
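The training procedure above can be sketched as a small stochastic gradient descent loop. This is a simplified illustration, not the patented algorithm itself: it assumes the 1-norm score, uses a plain uniform initialization without normalization, and all parameter names (`b` triples per round, `m` negatives per triple, `lr`, `epochs`, `seed`) are hypothetical choices.

```python
import random


def sign(x):
    return (x > 0) - (x < 0)


def train_transe(triples, entities, relations, k=8, gamma=1.0,
                 lr=0.01, epochs=100, b=2, m=2, seed=0):
    """Train embeddings on a list of (head, relation, tail) triples."""
    rng = random.Random(seed)
    ent = {e: [rng.uniform(-0.5, 0.5) for _ in range(k)] for e in entities}
    rel = {r: [rng.uniform(-0.5, 0.5) for _ in range(k)] for r in relations}
    known = set(triples)
    for _ in range(epochs):
        # Sample b positive triples, then m negatives for each of them.
        for h, r, t in rng.sample(triples, min(b, len(triples))):
            for _ in range(m):
                if rng.random() < 0.5:
                    h_neg, t_neg = rng.choice(entities), t  # replace head
                else:
                    h_neg, t_neg = h, rng.choice(entities)  # replace tail
                if (h_neg, r, t_neg) in known:
                    continue  # skip corruptions that are actually true
                d_pos = [a + c - d for a, c, d in zip(ent[h], rel[r], ent[t])]
                d_neg = [a + c - d
                         for a, c, d in zip(ent[h_neg], rel[r], ent[t_neg])]
                loss = gamma + sum(map(abs, d_pos)) - sum(map(abs, d_neg))
                if loss <= 0:
                    continue  # hinge is inactive, no update needed
                # Gradient step after each (positive, negative) combination.
                for i in range(k):
                    g_pos, g_neg = sign(d_pos[i]), sign(d_neg[i])
                    ent[h][i] -= lr * g_pos
                    ent[t][i] += lr * g_pos
                    rel[r][i] -= lr * (g_pos - g_neg)
                    ent[h_neg][i] += lr * g_neg
                    ent[t_neg][i] -= lr * g_neg
    return ent, rel


triples = [("Zhang San", "friend", "Li Si"), ("Li Si", "friend", "Wang Wu")]
entities = ["Zhang San", "Li Si", "Wang Wu"]
ent, rel = train_transe(triples, entities, ["friend"], k=4, epochs=20)
```

Updating after every combination mirrors the description that parameters are updated once the loss of each combination has been computed; a batched update would be an equally reasonable variant.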

further, the second completion of the knowledge graph is performed by using an entity prediction algorithm, an attribute prediction algorithm and a relationship prediction algorithm, and the specific process is as follows:

s331, complementing the knowledge graph by using an entity prediction algorithm

The input of the entity prediction algorithm comprises a flag; a head entity embedding vector, a tail entity embedding vector or an attribute embedding vector; an entity relationship embedding vector or an attribute relationship embedding vector; and the set of entity embedding vectors. The output comprises a triple embedding vector candidate set C and a triple embedding vector candidate score set S;

the entity prediction algorithm respectively carries out head entity prediction of the entity-entity triple, tail entity prediction of the entity-entity triple and head entity prediction of the entity-attribute triple;

for head entity prediction on entity-entity triples, each trained entity embedding vector is combined with a given trained tail entity embedding vector and entity relationship embedding vector, the score of each combination is calculated using the score function on entity-entity triples, and the traversed entity embedding vector is then stored, together with the given entity relationship embedding vector and tail entity embedding vector, as an entity-entity triple embedding vector candidate;

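for tail entity prediction on entity-entity triples, each trained entity embedding vector is combined with a given trained head entity embedding vector and entity relationship embedding vector, the score of each combination is calculated using the score function on entity-entity triples, and the traversed entity embedding vector is then stored, together with the given entity relationship embedding vector and head entity embedding vector, as an entity-entity triple embedding vector candidate;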

for head entity prediction on entity-attribute triples, each trained entity embedding vector is combined with a given trained attribute embedding vector and attribute relationship embedding vector, the score of each combination is calculated using the score function on entity-attribute triples, and the traversed entity embedding vector is then stored, together with the given attribute relationship embedding vector and attribute embedding vector, as an entity-attribute triple embedding vector candidate;

the sort() function in the entity prediction algorithm arranges the triple scores in ascending order, and the first N triples in that order are added to the knowledge graph, which completes the knowledge graph completion using the entity prediction algorithm;

the higher a triple ranks in the score ordering, the more likely it is to be correct, so the top N triple embedding vector candidates in the ordering are extracted (N is 1426 in the invention) and converted into entity-entity triples or entity-attribute triples. Finally, the newly obtained entity-entity triples and entity-attribute triples are added to the knowledge graph to complete the knowledge graph completion using entity prediction;
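The ranking step of entity prediction can be illustrated for the tail entity case. The sketch assumes a 1-norm score of the form head + relation - tail, scans every entity embedding as a candidate tail, sorts the scores in ascending order and keeps the first N. The function name, the toy vectors and the relation they stand for are hypothetical.

```python
def predict_tails(head_vec, rel_vec, entity_vecs, n):
    """Return the n entity names whose embeddings best complete (head, rel, ?)."""
    scored = []
    for name, tail_vec in entity_vecs.items():
        s = sum(abs(h + r - t) for h, r, t in zip(head_vec, rel_vec, tail_vec))
        scored.append((s, name))
    scored.sort()  # ascending: the smallest score is the most plausible
    return [name for _, name in scored[:n]]


# Hypothetical 2-dimensional trained embeddings.
entity_vecs = {
    "Zhang San": [0.0, 0.0],
    "Li Si": [1.0, 1.0],
    "Wang Wu": [5.0, 5.0],
}
friend_vec = [1.0, 1.0]
top = predict_tails(entity_vecs["Zhang San"], friend_vec, entity_vecs, n=2)
```

Head entity prediction works the same way with the roles of head and tail swapped; in practice candidates already present in the knowledge graph would be filtered out before the top N are added.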

s332, performing knowledge graph completion by using attribute prediction algorithm

Specifically, the input of the attribute prediction algorithm comprises a head entity embedding vector, an attribute relationship embedding vector and the set of attribute embedding vectors. The output comprises an entity-attribute triple embedding vector candidate set C and an entity-attribute triple embedding vector candidate score set S.

The attribute prediction algorithm combines each trained attribute embedding vector with a given trained head entity embedding vector and attribute relationship embedding vector, calculates the score of each combination using the score function on entity-attribute triples, and then stores the traversed attribute embedding vector, together with the given attribute relationship embedding vector and head entity embedding vector, as an entity-attribute triple embedding vector candidate;

the sort() function in the attribute prediction algorithm arranges the scores of the entity-attribute triples in ascending order, and the first N triples in that order are added to the knowledge graph, which completes the knowledge graph completion using the attribute prediction algorithm;

the higher an entity-attribute triple ranks in the score ordering, the more likely it is to hold, so the top-ranked candidates in the ordering are extracted and converted into entity-attribute triples. Finally, the newly obtained entity-attribute triples are added to the knowledge graph to complete the knowledge graph completion using attribute prediction;

s333, complementing the knowledge graph by using a relation prediction algorithm

Specifically, the input of the relation prediction algorithm comprises a flag; a head entity embedding vector; a tail entity embedding vector or an attribute embedding vector; an entity relationship embedding vector or an attribute relationship embedding vector; and the set of entity relationship embedding vectors or the set of attribute relationship embedding vectors. The output comprises a triple embedding vector candidate set C and a triple embedding vector candidate score set S. The relation prediction algorithm performs entity relationship prediction and attribute relationship prediction respectively.

For entity relationship prediction, each trained entity relationship embedding vector is combined with a given trained head entity embedding vector and tail entity embedding vector, the score of each combination is calculated using the score function on entity-entity triples, and the traversed entity relationship embedding vector is then stored, together with the given head entity embedding vector and tail entity embedding vector, as an entity-entity triple embedding vector candidate;

for attribute relationship prediction, each trained attribute relationship embedding vector is combined with a given trained head entity embedding vector and attribute embedding vector, the score of each combination is calculated using the score function on entity-attribute triples, and the traversed attribute relationship embedding vector is then stored, together with the given head entity embedding vector and attribute embedding vector, as an entity-attribute triple embedding vector candidate;

the sort() function in the relation prediction algorithm arranges the triple scores in ascending order, and the first N triples in that order are added to the knowledge graph, which completes the knowledge graph completion using the relation prediction algorithm.

The higher a triple ranks in the score ordering, the more likely it is to hold, so the top-ranked candidates in the ordering are extracted and converted into entity-entity triples or entity-attribute triples. Finally, the newly obtained entity-entity triples and entity-attribute triples are added to the knowledge graph to complete the knowledge graph completion using relation prediction.
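Relation prediction follows the same pattern as entity prediction, except that the traversal runs over relationship embeddings while head and tail are fixed. The sketch below again assumes a 1-norm score of the form head + relation - tail; the function name and the toy vectors are hypothetical, and the relation labels are only illustrative.

```python
def predict_relations(head_vec, tail_vec, relation_vecs, n):
    """Return the n relation names that best link the given head and tail."""
    scored = []
    for name, rel_vec in relation_vecs.items():
        s = sum(abs(h + r - t) for h, r, t in zip(head_vec, rel_vec, tail_vec))
        scored.append((s, name))
    scored.sort()  # ascending order, as in the sort() step described above
    return [name for _, name in scored[:n]]


# Hypothetical 2-dimensional trained embeddings.
relation_vecs = {
    "friend": [1.0, 1.0],
    "deskmate": [0.0, 1.0],
    "same class": [-1.0, -1.0],
}
best = predict_relations([0.0, 0.0], [1.0, 1.0], relation_vecs, n=2)
```

For attribute relationship prediction, the tail vector would be an attribute embedding and the traversal would run over the attribute relationship embeddings.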

The invention has the following beneficial effects: the invention provides a knowledge graph completion method based on rules and embedding that achieves both high accuracy and good usability.

By applying the provided knowledge graph complementing method, the school social network knowledge graph is complemented and analyzed, hidden school violence events can be found, and further the occurrence of the school violence events is prevented and reduced.

Drawings

FIG. 1 is an exemplary rule-based knowledge graph completion diagram;

FIG. 2 is a schematic diagram of the embedding-based knowledge graph completion process;

FIG. 3 is an exemplary diagram of embedding-based knowledge graph completion.

Detailed Description

The knowledge graph completion method based on rules and embedding provided by the invention can be applied to completing the knowledge graph of a school social network, and latent campus violence events can be predicted from the completed school social network knowledge graph. The specific implementation process is as follows:

S1, defining the relation triples in the knowledge graph of the school social network. The specific process is as follows:

S11, defining the entity set, the attribute set, the entity relationship set and the attribute relationship set in the knowledge graph of the school social network. The entity set is denoted E = {e₁, e₂, ..., e_|E|}, where e₁, e₂, ..., e_|E| represent entities in the school social network; the attribute set is denoted A = {a₁, a₂, ..., a_|A|}, where a₁, a₂, ..., a_|A| represent attributes in the school social network; the relationship set is denoted R = {Re, Ra}, where Re = {re₁, re₂, ..., re_|Re|} represents the entity relationships in the school social network and Ra = {ra₁, ra₂, ..., ra_|Ra|} represents the attribute relationships in the school social network.

Specifically, in this embodiment, E = {Zhang San, Li Si, Wang Wu, ...} represents the entity set; A = {...} represents the attribute set; Re = {...} represents the entity relationship set; Ra = {...} represents the attribute relationship set.

S12, defining entity-entity triples based on the entities in the entity set and the entity relationships in the entity relationship set. An entity-entity triple is te = (e_i, re_k, e_j)_e, te ∈ Te, where e_i, e_j ∈ E represent entities in the knowledge graph of the school social network, e_i is the head entity, e_j is the tail entity, re_k ∈ Re represents an entity relationship in the knowledge graph of the school social network, and Te = {te₁, te₂, ..., te_|Te|} represents the set of entity-entity triples of the school social network.

Specifically, in this embodiment, the entity-entity triple (Zhang San, friend, Li Si)_e indicates that Zhang San's friend is Li Si.

S13, defining entity-attribute triples based on the entities in the entity set, the attributes in the attribute set and the attribute relationships in the attribute relationship set. An entity-attribute triple is ta = (e_i, ra_k, a_j)_a, ta ∈ Ta, where e_i ∈ E represents an entity in the knowledge graph of the school social network, e_i is the head entity, a_j ∈ A represents an attribute in the knowledge graph of the school social network, a_j is the tail attribute, ra_k ∈ Ra represents an attribute relationship in the knowledge graph of the school social network, and Ta = {ta₁, ta₂, ..., ta_|Ta|} represents the set of entity-attribute triples of the school social network.

Specifically, in this embodiment, the entity-attribute triple (Zhang San, disciplined for, fighting)_a indicates that the disciplinary action Zhang San received was for fighting.
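The definitions in S11 to S13 map naturally onto simple data structures. The following Python sketch is one possible representation, not something prescribed by the source: the class names are hypothetical, the entity-entity triple comes from the embodiment, and the entity-attribute triple ("grade", "Grade 8") is invented purely for illustration.

```python
from typing import NamedTuple


class EntityTriple(NamedTuple):
    head: str       # head entity e_i
    relation: str   # entity relationship re_k
    tail: str       # tail entity e_j


class AttributeTriple(NamedTuple):
    head: str       # head entity e_i
    relation: str   # attribute relationship ra_k
    attribute: str  # tail attribute a_j


Te = {EntityTriple("Zhang San", "friend", "Li Si")}
Ta = {AttributeTriple("Zhang San", "grade", "Grade 8")}  # hypothetical triple

# The four sets of S11 can be derived from the triples themselves.
E = {t.head for t in Te} | {t.tail for t in Te} | {t.head for t in Ta}
A = {t.attribute for t in Ta}
Re = {t.relation for t in Te}
Ra = {t.relation for t in Ta}
```

Keeping the two triple types separate makes it straightforward to apply entity rules only to Te and the attribute association rule only to Ta.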

S2, performing first completion on the knowledge graph based on four relation rules, wherein the specific process is as follows:

S21, completing the entity-entity triple set of the school social network based on the transfer rule. First, the transfer rule is defined as

$$(e_i, re_l, e_j)_e \wedge (e_j, re_m, e_k)_e \Rightarrow (e_i, re_n, e_k)_e$$

where e_i, e_j, e_k ∈ E represent entities and re_l, re_m, re_n ∈ Re represent entity relationships, and all specific transfer rules that conform to the form of the transfer rule are extracted from the knowledge graph of the school social network. Specifically, in this embodiment, the specific transfer rules extracted from the knowledge graph may be

$$(e_1, \text{deskmate}, e_2)_e \wedge (e_2, \text{same class}, e_3)_e \Rightarrow (e_1, \text{schoolmate}, e_3)_e$$

$$(e_4, \text{deskmate}, e_5)_e \wedge (e_5, \text{friend}, e_6)_e \Rightarrow (e_4, \text{schoolmate}, e_6)_e$$

Second, the key information of each specific transfer rule is extracted to form a transfer rule candidate tc = (re_l, re_m, re_n)_c, where re_l, re_m, re_n ∈ Re represent entity relationships. Specifically, in this embodiment, the transfer rule candidates formed by extracting the key information of the specific transfer rules are tc₁ = (deskmate, same class, schoolmate)_c and tc₂ = (deskmate, friend, schoolmate)_c. All transfer rule candidates constitute the transfer rule candidate set Stc.

Third, for tc₁ = (deskmate, same class, schoolmate)_c: if the deskmate of e₁ is e₂ and e₂ is in the same class as e₃, then clearly e₁ and e₃ are schoolmates, that is, the deskmate relationship and the same-class relationship transfer to the schoolmate relationship, so tc₁ is a correct transfer rule candidate. For tc₂ = (deskmate, friend, schoolmate)_c, however: if the deskmate of e₄ is e₅ and the friend of e₅ is e₆, it does not necessarily follow that e₄ and e₆ are schoolmates, so tc₂ is an incorrect transfer rule candidate. Each transfer rule candidate therefore needs to be filtered to determine its correctness. Each transfer rule candidate tc is filtered by the transfer rule candidate filtering algorithm to obtain a transfer rule correct candidate trc = (re_l, re_m, re_n)_r, where re_l, re_m, re_n ∈ Re represent entity relationships. All transfer rule correct candidates constitute the transfer rule correct candidate set Strc.

The inputs of the transfer rule candidate filtering algorithm are the entity-entity triple set Te, the transfer rule candidate set Stc, and the thresholds τ₁ and τ₂; the output is the transfer rule correct candidate set Strc. The algorithm first uses Function1() to obtain the number of times the first two entity relationships of each transfer rule candidate occur together. For example, for the transfer rule candidate tc₁ = (deskmate, same class, classmate)_c, a count n is initialized to 0; whenever two entity-entity triples (e₁, deskmate, e₂)_e and (e₂, same class, e₃)_e occur together in the knowledge graph of the school social network, n is incremented by 1. This is repeated until every co-occurrence of the first two entity relationships of the candidate has been found, and Function1() then returns n. Second, the ratio of the number of occurrences of the transfer rule candidate to the number of co-occurrences of its first two entity relationships is taken as the score of the candidate, and this step is repeated to compute the score of every transfer rule candidate. Finally, each transfer rule candidate is filtered using the thresholds τ₁ and τ₂. The number of occurrences of the candidate is first compared with τ₁; if it is larger, the next comparison is made. The score of the candidate is then compared with τ₂; if it is larger, the candidate is added to the transfer rule correct candidate set Strc.
The comparison with τ₁ confirms that the transfer rule candidate occurs often enough, and the comparison with τ₂ confirms that the candidate is correct. In this way, every transfer rule candidate is filtered with the transfer rule candidate filtering algorithm, which yields the transfer rule correct candidate set Strc.
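The filtering step described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names `function1` and `filter_transfer_rules`, the tuple encoding of triples, and the sample data are all assumptions, and "number of occurrences of a candidate" is read here as the number of co-occurring chains for which the third relationship also holds.

```python
def function1(triples, r1, r2):
    """Return every chain (e1, e3) for which (e1, r1, e2) and (e2, r2, e3) both occur."""
    return [(h, t2)
            for (h, ra, t) in triples if ra == r1
            for (h2, rb, t2) in triples if rb == r2 and h2 == t]


def filter_transfer_rules(triples, candidates, tau1, tau2):
    """Keep candidates (r1, r2, r3) whose occurrence count exceeds tau1 and score exceeds tau2."""
    known = set(triples)
    correct = []
    for r1, r2, r3 in candidates:
        chains = function1(known, r1, r2)
        n = len(chains)  # co-occurrences of the first two relationships
        # Assumed reading: a candidate "occurs" when the conclusion triple is also present.
        occurrences = sum((h, r3, t) in known for h, t in chains)
        score = occurrences / n if n else 0.0
        if occurrences > tau1 and score > tau2:
            correct.append((r1, r2, r3))
    return correct


# Hypothetical sample data for illustration only.
TRIPLES = [
    ("Zhang San", "deskmate", "Wang Wu"),
    ("Wang Wu", "same class", "Xiao Hong"),
    ("Zhang San", "classmate", "Xiao Hong"),
    ("Li Si", "deskmate", "Zheng Qi"),
    ("Zheng Qi", "same class", "Zhao Liu"),
    ("Li Si", "classmate", "Zhao Liu"),
    ("Wang Wu", "friend", "Zhao Liu"),
]
CANDIDATES = [
    ("deskmate", "same class", "classmate"),  # tc1
    ("deskmate", "friend", "classmate"),      # tc2
]
STRC = filter_transfer_rules(TRIPLES, CANDIDATES, tau1=1, tau2=0.5)
```

With these sample values only tc₁ survives, because tc₂ has one co-occurrence of its first two relationships but no supporting classmate triple.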

In this embodiment, the thresholds may be set according to actual conditions.

Specifically, in this embodiment, filtering the transfer rule candidates tc₁ and tc₂ with the algorithm yields the transfer rule correct candidate trc₁ = (deskmate, same class, classmate)_r; tc₂ is screened out because it is incorrect.

Fourth, a transfer rule completion entity-entity triple tet is obtained from two known entity-entity triples in the knowledge graph of the school social network and the corresponding transfer rule correct candidate trc. Specifically, in this embodiment, the two known entity-entity triples (Zhang San, deskmate, Wang Wu)_e and (Wang Wu, same class, Xiao Hong)_e in the knowledge graph of the school social network, together with the transfer rule correct candidate trc₁ = (deskmate, same class, classmate)_r, yield the new entity-entity triple (Zhang San, classmate, Xiao Hong)_e. All transfer rule completion entity-entity triples form the transfer rule completion entity-entity triple set Stet = {tet₁, tet₂, ..., tet_|Stet|}.

Finally, all entity-entity triples in the transfer rule completion entity-entity triple set Stet are added to the knowledge graph of the school social network, which completes the transfer-rule-based completion of that knowledge graph.
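Applying the correct transfer rules to produce new triples can be sketched as below. The function name `complete_with_transfer_rules` and the tuple representation are assumptions made for illustration; the example reuses the triples named in this embodiment.

```python
def complete_with_transfer_rules(triples, rules):
    """Return new triples (e1, r3, e3) implied by (e1, r1, e2), (e2, r2, e3) and a rule (r1, r2, r3)."""
    known = set(triples)
    new = set()
    for r1, r2, r3 in rules:
        for h, ra, mid in known:
            if ra != r1:
                continue
            for h2, rb, t in known:
                # Only emit triples that are not already in the knowledge graph.
                if rb == r2 and h2 == mid and (h, r3, t) not in known:
                    new.add((h, r3, t))
    return new


KNOWN = [
    ("Zhang San", "deskmate", "Wang Wu"),
    ("Wang Wu", "same class", "Xiao Hong"),
]
STET = complete_with_transfer_rules(KNOWN, [("deskmate", "same class", "classmate")])
```

The returned set plays the role of Stet; adding it to the known triples corresponds to the final step of S21.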

S22. Complete the entity-entity triple set based on the antisymmetric rule. First, the antisymmetric rule is defined as

where e_i, e_j ∈ E denote entities and re_l, re_m ∈ Re denote entity relationships. All specific antisymmetric rules that match the form of the antisymmetric rule are extracted from the knowledge graph of the school social network. Specifically, in this embodiment, the specific antisymmetric rules extracted from the knowledge graph of the school social network may be

Second, the key information of each specific antisymmetric rule is extracted to form an antisymmetric rule candidate ac = (re_l, re_m)_c, where re_l, re_m ∈ Re denote entity relationships. Specifically, in this embodiment, the antisymmetric rule candidates formed by extracting the key information of the specific antisymmetric rules are ac₁ = (teacher, student)_c and ac₂ = (friend, classmate)_c. All antisymmetric rule candidates form the antisymmetric rule candidate set Sac.

Third, the correctness of any antisymmetric rule candidate is likewise undetermined. For ac₁ = (teacher, student)_c, if e₁'s teacher is e₂, then e₂'s student is clearly e₁; the teacher entity relationship and the student entity relationship are antisymmetric to each other, so ac₁ is a correct antisymmetric rule candidate. For ac₂ = (friend, classmate)_c, however, if e₃ is e₄'s friend, e₄ is not necessarily e₃'s classmate, so ac₂ is an incorrect antisymmetric rule candidate. Each antisymmetric rule candidate therefore needs to be filtered to determine its correctness. Each antisymmetric rule candidate ac is filtered with the antisymmetric rule candidate filtering algorithm to obtain an antisymmetric rule correct candidate arc = (re_l, re_m)_r, where re_l, re_m ∈ Re denote entity relationships. All antisymmetric rule correct candidates form the antisymmetric rule correct candidate set Sarc.

The inputs of the antisymmetric rule candidate filtering algorithm are the entity-entity triple set Te, the antisymmetric rule candidate set Sac, and the thresholds τ₃ and τ₄; the output is the antisymmetric rule correct candidate set Sarc. The algorithm first uses Function2() to obtain the number of times each entity relationship of each antisymmetric rule candidate occurs in the knowledge graph. For example, for the antisymmetric rule candidate ac₁ = (teacher, student)_c, two counts n₁ and n₂, which record how often the two entity relationships of the candidate occur in the knowledge graph, are initialized to 0. Whenever an entity-entity triple (e₁, teacher, e₂)_e occurs in the knowledge graph, n₁ is incremented by 1; whenever an entity-entity triple (e₃, student, e₄)_e occurs in the knowledge graph, n₂ is incremented by 1. This is repeated until every occurrence of the two entity relationships of the candidate has been counted, and Function2() then returns n₁ and n₂. Second, the ratios of the number of occurrences of the antisymmetric rule candidate to the number of occurrences of each of its two entity relationships are computed and taken as the two scores of the candidate; this step is repeated to compute the two scores of every antisymmetric rule candidate. Finally, each antisymmetric rule candidate is filtered using the thresholds τ₃ and τ₄. The number of occurrences of the candidate is first compared with τ₃; if it is larger, the next comparison is made.
The two scores of the candidate are then each compared with τ₄; if both scores are greater than τ₄, the candidate is added to the antisymmetric rule correct candidate set Sarc. The comparison with τ₃ confirms that the antisymmetric rule candidate occurs often enough, and the comparison with τ₄ confirms that the candidate is correct. In this way, every antisymmetric rule candidate is filtered with the antisymmetric rule candidate filtering algorithm, which yields the antisymmetric rule correct candidate set Sarc.
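A compact sketch of the antisymmetric filtering just described is given below. The name `filter_antisymmetric_rules` and the sample triples are hypothetical, and a candidate is assumed to "occur" once for every pair of triples (a, re_l, b) and (b, re_m, a) found in the graph.

```python
def filter_antisymmetric_rules(triples, candidates, tau3, tau4):
    """Keep candidates (r1, r2) with enough occurrences and two scores above tau4."""
    known = set(triples)
    correct = []
    for r1, r2 in candidates:
        # Function2(): occurrences of each of the two relationships.
        n1 = sum(r == r1 for _, r, _ in known)
        n2 = sum(r == r2 for _, r, _ in known)
        # Assumed reading: the candidate occurs when both mirrored triples are present.
        occ = sum(r == r1 and (t, r2, h) in known for h, r, t in known)
        if occ > tau3 and n1 and n2 and occ / n1 > tau4 and occ / n2 > tau4:
            correct.append((r1, r2))
    return correct


# Hypothetical sample data for illustration only.
TRIPLES = [
    ("Zhang San", "teacher", "Sun Jie"), ("Sun Jie", "student", "Zhang San"),
    ("Li Si", "teacher", "Sun Jie"), ("Sun Jie", "student", "Li Si"),
    ("Wang Wu", "teacher", "Sun Jie"),
    ("Zhang San", "friend", "Li Si"), ("Li Si", "classmate", "Wang Wu"),
]
SARC = filter_antisymmetric_rules(
    TRIPLES, [("teacher", "student"), ("friend", "classmate")], tau3=1, tau4=0.5
)
```

Here (teacher, student) has two occurrences with scores 2/3 and 1, so it passes; (friend, classmate) never occurs in mirrored form and is dropped.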

Specifically, in this embodiment, filtering the antisymmetric rule candidates ac₁ and ac₂ yields the antisymmetric rule correct candidate arc₁ = (teacher, student)_r; ac₂ is screened out because it is incorrect.

Fourth, an antisymmetric rule completion entity-entity triple tea is obtained from a known entity-entity triple in the knowledge graph of the school social network and the corresponding antisymmetric rule correct candidate arc. Specifically, in this embodiment, the known entity-entity triple (Zhang San, teacher, Sun Jie)_e in the knowledge graph of the school social network, together with the antisymmetric rule correct candidate arc₁ = (teacher, student)_r, yields the new entity-entity triple (Sun Jie, student, Zhang San)_e. All antisymmetric rule completion entity-entity triples form the antisymmetric rule completion entity-entity triple set Stea = {tea₁, tea₂, ..., tea_|Stea|}.

Finally, all entity-entity triples in the antisymmetric rule completion entity-entity triple set Stea are added to the knowledge graph of the school social network, which completes the knowledge graph completion based on the antisymmetric rule.

S23. Complete the entity-entity triple set based on the entity association rule. First, the entity association rule is defined as

where e_i, e_j, e_k ∈ E with e_j ≠ e_k denote entities and re_l, re_m ∈ Re denote entity relationships. All specific entity association rules that match the form of the entity association rule are extracted from the knowledge graph of the school social network. Specifically, in this embodiment, the specific entity association rules extracted from the knowledge graph of the school social network may be

Second, the key information of each specific entity association rule is extracted to form an entity association rule candidate rc = (re_l, e_j, re_m, e_k)_cr, where e_j, e_k ∈ E denote entities and re_l, re_m ∈ Re denote entity relationships. Specifically, in this embodiment, the entity association rule candidates formed by extracting the key information of the specific entity association rules are rc₁ = (teacher, Sun Jie, classmate, Zhao Liu)_cr and rc₂ = (friend, Li Si, friend, Zheng Qi)_cr. All entity association rule candidates form the entity association rule candidate set Src.

Third, for rc₁ = (teacher, Sun Jie, classmate, Zhao Liu)_cr, if e₁'s teacher is Sun Jie, then e₁'s classmate is Zhao Liu, because Zhao Liu's teacher is Sun Jie; rc₁ is therefore a correct entity association rule candidate. If the association direction of rc₁ is reversed, giving rc₁′ = (classmate, Zhao Liu, teacher, Sun Jie)_cr, the rule states that if e₂'s classmate is Zhao Liu, then e₂'s teacher is Sun Jie. This is clearly incorrect: if e₂'s classmate is Zhao Liu, e₂'s teacher is not necessarily Sun Jie and may be another teacher. For rc₂ = (friend, Li Si, friend, Zheng Qi)_cr, the association is incorrect in either direction. For each entity association rule candidate, therefore, both its correctness and its association direction need to be confirmed. Each entity association rule candidate rc is filtered with the entity association rule candidate filtering algorithm to obtain an entity association rule correct candidate rrc = (re_l, e_j, re_m, e_k)_rr, where e_j, e_k ∈ E denote entities and re_l, re_m ∈ Re denote entity relationships. All entity association rule correct candidates form the entity association rule correct candidate set Srrc.

The inputs of the entity association rule candidate filtering algorithm are the entity-entity triple set Te, the entity association rule candidate set Src, and the thresholds τ₅ and τ₆; the output is the entity association rule correct candidate set Srrc. The algorithm first uses Function3() to obtain, for each entity association rule candidate, the number of times each of its entity relationships occurs with its corresponding entity and the number of times both relationship-entity combinations occur together. For example, for the entity association rule candidate rc₁ = (teacher, Sun Jie, classmate, Zhao Liu)_cr, three counts n₁, n₂, and n₃ are initialized to 0. They record, respectively, the number of occurrences of the first entity relationship with its corresponding entity, the number of occurrences of the second entity relationship with its corresponding entity, and the number of times both combinations occur together. Whenever an entity-entity triple (e₁, teacher, Sun Jie)_e occurs in the knowledge graph, n₁ is incremented by 1; whenever an entity-entity triple (e₂, classmate, Zhao Liu)_e occurs in the knowledge graph, n₂ is incremented by 1; whenever two entity-entity triples (e₃, teacher, Sun Jie)_e and (e₃, classmate, Zhao Liu)_e occur together, n₃ is incremented by 1. This is repeated until all three kinds of occurrence have been counted for the candidate, and Function3() then returns n₁, n₂, and n₃.

Second, let n₃ be the number of occurrences of an entity association rule candidate, n₁ the number of occurrences of the first entity relationship of the candidate with its corresponding entity, and n₂ the number of occurrences of the second entity relationship of the candidate with its corresponding entity. The ratio of n₃ to n₁ and the ratio of n₃ to n₂ are computed.

The two ratios are taken as the two scores of the entity association rule candidate, and this step is repeated to compute the two scores of every entity association rule candidate. Finally, each entity association rule candidate is filtered using the thresholds τ₅ and τ₆. The number of occurrences of the candidate is first compared with τ₅; if it is larger, the next comparison is made.

The two scores of the entity association rule candidate are then compared with τ₆; if both scores are greater than τ₆, the next comparison is made.

The two scores of the entity association rule candidate are compared with each other. If the first score is smaller than the second, the association direction of the candidate is exchanged with the Swap() function; otherwise it is left unchanged. For example, applying Swap() to the entity association rule candidate rc₁ = (teacher, Sun Jie, classmate, Zhao Liu)_cr exchanges its association direction, so that rc₁ becomes rc₁′ = (classmate, Zhao Liu, teacher, Sun Jie)_cr. Finally, the entity association rule candidate is added to the entity association rule correct candidate set Srrc. The comparison with τ₅ confirms that the entity association rule candidate occurs often enough, the comparison with τ₆ confirms that the candidate is correct, and the comparison between the two scores determines the association direction. In this way, the entity association rule candidates are filtered with the entity association rule candidate filtering algorithm, which yields the entity association rule correct candidate set Srrc.
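The counting, scoring, and direction check for entity association rules might look like the sketch below. The function name `filter_association_rules`, the set-based counting by head entity, and the sample data are assumptions; the swap follows the stated condition that the direction is exchanged when the first score is smaller than the second.

```python
def filter_association_rules(triples, candidates, tau5, tau6):
    """Filter candidates (r1, v1, r2, v2) and orient each survivor by its two scores."""
    known = set(triples)
    correct = []
    for r1, v1, r2, v2 in candidates:
        # Function3(): heads that occur with each (relationship, entity) combination.
        heads1 = {h for h, r, t in known if (r, t) == (r1, v1)}
        heads2 = {h for h, r, t in known if (r, t) == (r2, v2)}
        n1, n2, n3 = len(heads1), len(heads2), len(heads1 & heads2)
        if n3 == 0 or n3 <= tau5:
            continue
        s1, s2 = n3 / n1, n3 / n2
        if s1 > tau6 and s2 > tau6:
            # Swap(): exchange the association direction when the first score is smaller.
            rule = (r2, v2, r1, v1) if s1 < s2 else (r1, v1, r2, v2)
            correct.append(rule)
    return correct


# Hypothetical sample data for illustration only.
TRIPLES = [
    ("Zhang San", "teacher", "Sun Jie"), ("Li Si", "teacher", "Sun Jie"),
    ("Wang Wu", "teacher", "Sun Jie"),
    ("Zhang San", "classmate", "Zhao Liu"), ("Li Si", "classmate", "Zhao Liu"),
    ("Wang Wu", "classmate", "Zhao Liu"), ("Xiao Hong", "classmate", "Zhao Liu"),
    ("Zhang San", "friend", "Li Si"), ("Wang Wu", "friend", "Zheng Qi"),
]
CANDIDATES = [
    ("classmate", "Zhao Liu", "teacher", "Sun Jie"),
    ("friend", "Li Si", "friend", "Zheng Qi"),
]
SRRC = filter_association_rules(TRIPLES, CANDIDATES, tau5=2, tau6=0.5)
```

In this sample the first candidate has scores 3/4 and 1, so its direction is exchanged to (teacher, Sun Jie, classmate, Zhao Liu); the second candidate never co-occurs and is removed. Because entity-attribute triples have the same three-part shape, the same sketch could plausibly serve for attribute association rules as well.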

Specifically, in this embodiment, filtering the entity association rule candidates rc₁ and rc₂ yields the entity association rule correct candidate rrc₁ = (teacher, Sun Jie, classmate, Zhao Liu)_rr; rc₂ is screened out because it is incorrect.

Fourth, an entity association rule completion entity-entity triple ter is obtained from a known entity-entity triple in the knowledge graph and the corresponding entity association rule correct candidate rrc. Specifically, in this embodiment, the known entity-entity triple (Zhang San, teacher, Sun Jie)_e in the knowledge graph, together with the entity association rule correct candidate rrc₁ = (teacher, Sun Jie, classmate, Zhao Liu)_rr, yields the new entity-entity triple (Zhang San, classmate, Zhao Liu)_e. All entity association rule completion entity-entity triples form the entity association rule completion entity-entity triple set Ster = {ter₁, ter₂, ..., ter_|Ster|}.

Finally, all entity-entity triples in the entity association rule completion entity-entity triple set Ster are added to the knowledge graph, which completes the knowledge graph completion based on the entity association rule.
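Applying a correct entity association rule to known triples can be sketched as follows. The function name is hypothetical, and the guard that prevents an entity from being related to itself is an added assumption that the text does not state.

```python
def complete_with_association_rules(triples, rules):
    """For each rule (r1, v1, r2, v2), every head of (h, r1, v1) gains the triple (h, r2, v2)."""
    known = set(triples)
    new = set()
    for r1, v1, r2, v2 in rules:
        for h, r, t in known:
            # Assumption: skip h == v2 so that no entity is linked to itself.
            if (r, t) == (r1, v1) and h != v2 and (h, r2, v2) not in known:
                new.add((h, r2, v2))
    return new


KNOWN = [
    ("Zhang San", "teacher", "Sun Jie"),
    ("Zhao Liu", "teacher", "Sun Jie"),
]
STER = complete_with_association_rules(
    KNOWN, [("teacher", "Sun Jie", "classmate", "Zhao Liu")]
)
```

The result corresponds to the set Ster of this embodiment for the single rule rrc₁.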

S24. Complete the entity-attribute triple set based on the attribute association rule. First, the attribute association rule is defined as

where e_i ∈ E denotes an entity, a_j, a_k ∈ A with a_j ≠ a_k denote attributes, and ra_l, ra_m ∈ Ra denote attribute relationships. All specific attribute association rules that match the form of the attribute association rule are extracted from the knowledge graph of the school social network. Specifically, in this embodiment, the specific attribute association rules extracted from the knowledge graph of the school social network may be

Second, the key information of each specific attribute association rule is extracted to form an attribute association rule candidate atrc = (ra_l, a_j, ra_m, a_k)_car, where a_j, a_k ∈ A with a_j ≠ a_k denote attributes and ra_l, ra_m ∈ Ra denote attribute relationships. Specifically, in this embodiment, the attribute association rule candidates formed by extracting the key information of the specific attribute association rules are atrc₁ = (punishment, fighting, grades, poor)_car and atrc₂ = (punishment, none, grades, good)_car. All attribute association rule candidates form the attribute association rule candidate set Satrc.

Third, for atrc₁ = (punishment, fighting, grades, poor)_car, if e₁ has been punished for fighting, then e₁'s grades must be poor, so atrc₁ is a correct attribute association rule candidate. If the association direction of atrc₁ is reversed, giving atrc₁′ = (grades, poor, punishment, fighting)_car, the rule states that if e₂'s grades are poor, then e₂ has been punished for fighting. This does not necessarily hold: e₂ may have received no punishment or a punishment for something else. For atrc₂ = (punishment, none, grades, good)_car, the association is incorrect in either direction. For each attribute association rule candidate, therefore, both its correctness and its association direction need to be confirmed. Each attribute association rule candidate atrc is filtered with the attribute association rule candidate filtering algorithm to obtain an attribute association rule correct candidate arrc = (ra_l, a_j, ra_m, a_k)_aar, where a_j, a_k ∈ A with a_j ≠ a_k denote attributes and ra_l, ra_m ∈ Ra denote attribute relationships.

The inputs of the attribute association rule candidate filtering algorithm are the entity-attribute triple set Ta, the attribute association rule candidate set Satrc, and the thresholds τ₇ and τ₈; the output is the attribute association rule correct candidate set Sarrc. The algorithm first uses Function4() to obtain, for each attribute association rule candidate, the number of times each of its two attribute relationships occurs with its respective attribute and the number of times both relationship-attribute combinations occur together. For example, for the attribute association rule candidate atrc₁ = (punishment, fighting, grades, poor)_car, three counts n₁, n₂, and n₃ are initialized to 0. They record, respectively, the number of occurrences of each of the two attribute relationships with its respective attribute and the number of times both combinations occur together. Whenever an entity-attribute triple (e₁, punishment, fighting)_a occurs in the knowledge graph, n₁ is incremented by 1; whenever an entity-attribute triple (e₂, grades, poor)_a occurs in the knowledge graph, n₂ is incremented by 1; whenever two entity-attribute triples (e₃, punishment, fighting)_a and (e₃, grades, poor)_a occur together, n₃ is incremented by 1. This is repeated until all three kinds of occurrence have been counted for the candidate, and Function4() then returns n₁, n₂, and n₃.
Second, the ratios of the number of occurrences of each attribute association rule candidate to the number of occurrences of each of its two attribute relationships with its respective attribute are computed and taken as the two scores of the candidate; this step is repeated to compute the two scores of every attribute association rule candidate. Finally, each attribute association rule candidate is filtered using the thresholds τ₇ and τ₈. The number of occurrences of the candidate is first compared with τ₇; if it is larger, the next comparison is made. The two scores of the candidate are then compared with τ₈; if they are larger, the next comparison is made.

The two scores of the attribute association rule candidate are compared with each other. If the first score is smaller than the second, the association direction of the candidate is exchanged with the Swap() function; otherwise it is left unchanged. For example, the attribute association rule candidate atrc₁ = (punishment, fighting, grades, poor)_car becomes atrc₁′ = (grades, poor, punishment, fighting)_car after the exchange. Finally, the attribute association rule candidate is added to the attribute association rule correct candidate set Sarrc. The comparison with τ₇ confirms that the attribute association rule candidate occurs often enough, the comparison with τ₈ confirms that the candidate is correct, and the comparison between the two scores determines the association direction. In this way, the attribute association rule candidates are filtered with the attribute association rule candidate filtering algorithm, which yields the attribute association rule correct candidate set Sarrc.

Specifically, in this embodiment, filtering the attribute association rule candidates atrc₁ and atrc₂ yields the attribute association rule correct candidate arrc₁ = (punishment, fighting, grades, poor)_arr; atrc₂ is screened out because it is incorrect. All attribute association rule correct candidates form the attribute association rule correct candidate set Sarrc.

Fourth, an attribute association rule completion entity-attribute triple tar is obtained from a known entity-attribute triple in the knowledge graph and the corresponding attribute association rule correct candidate arrc. Specifically, in this embodiment, the known entity-attribute triple (Zhang San, punishment, fighting)_a in the knowledge graph, together with the attribute association rule correct candidate arrc₁ = (punishment, fighting, grades, poor)_arr, yields the new entity-attribute triple (Zhang San, grades, poor)_a. All attribute association rule completion entity-attribute triples form the attribute association rule completion entity-attribute triple set Star = {tar₁, tar₂, ..., tar_|Star|}.

Finally, all entity-attribute triples in the attribute association rule completion entity-attribute triple set Star are added to the knowledge graph, which completes the knowledge graph completion based on the attribute association rule.

After the rule-based knowledge graph completion, the first-completed knowledge graph of the school social network shown in Fig. 1 is obtained. In Fig. 1, the completed entity-entity triples are drawn as thick solid lines and the completed entity-attribute triples as thick dashed lines. From the existing entity-entity and entity-attribute triples together with the completed ones, it can be seen that Zhang San's friend Li Si bullies Zhao Liu, that Zhang San and Zhao Liu are classmates, and that Zhang San has been punished for fighting. From the relationships between Zhang San and Zhao Liu and from Zhang San's attributes, it can be inferred that Zhang San may be bullying Zhao Liu, and whether Zhao Liu is in fact being bullied can then be investigated on the basis of this inference.

S3. As shown in Fig. 2, based on the first-completed knowledge graph, the relation triples defined in the knowledge graph are initialized as embedding vectors that are easy for a machine to process, and the embedding vectors are then trained. Finally, the trained embedding vectors are used to perform embedding-based knowledge graph completion, and the predicted new triples complete the knowledge graph a second time. The specific process is as follows:

S31. Convert the entities, attributes, entity relationships, and attribute relationships in the knowledge graph into embedding vectors, and define a score function and a loss function on entity-entity triples and on entity-attribute triples based on the TransE model. Specifically, the entity embedding vector set is defined as E = {e₁, e₂, ..., e_|E|}, where the vectors e₁, e₂, ..., e_|E| are the entity embedding vectors corresponding to the entities e₁, e₂, ..., e_|E|. The attribute embedding vector set is defined as A = {a₁, a₂, ..., a_|A|}, where the vectors a₁, a₂, ..., a_|A| are the attribute embedding vectors corresponding to the attributes a₁, a₂, ..., a_|A|. The entity relationship embedding vector set is defined as Re = {re₁, re₂, ..., re_|Re|}, where the vectors re₁, re₂, ..., re_|Re| are the entity relationship embedding vectors corresponding to the entity relationships re₁, re₂, ..., re_|Re|. The attribute relationship embedding vector set is defined as Ra = {ra₁, ra₂, ..., ra_|Ra|}, where the vectors ra₁, ra₂, ..., ra_|Ra| are the attribute relationship embedding vectors corresponding to the attribute relationships ra₁, ra₂, ..., ra_|Ra|.

This embodiment defines two types of triples, entity-entity triples and entity-attribute triples, so a separate score function is needed to judge the correctness of each type. The score function on an entity-entity triple is defined as

where e_i, e_j ∈ E denote entity embedding vectors, re_l ∈ Re denotes an entity relationship embedding vector, and L₁|L₂ denotes either the 1-norm or the 2-norm. The formula adds the entity relationship embedding vector re_l to the entity embedding vector e_i, subtracts the entity embedding vector e_j, and takes the 1-norm or 2-norm of the result to measure the correctness of the entity-entity triple. The smaller the value of the score function, the closer e_i plus re_l is to e_j and the more likely the entity-entity triple is to hold; the larger the value, the less likely. The score function on an entity-attribute triple is defined as

where e_i ∈ E denotes an entity embedding vector, a_j ∈ A denotes an attribute embedding vector, ra_l ∈ Ra denotes an attribute relationship embedding vector, and L₁|L₂ denotes either the 1-norm or the 2-norm. The formula adds the attribute relationship embedding vector ra_l to the entity embedding vector e_i, subtracts the attribute embedding vector a_j, and takes the 1-norm or 2-norm of the result to measure the correctness of the entity-attribute triple. The smaller the value, the more likely the entity-attribute triple is to hold; the larger the value, the less likely.
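The verbal description of the two score functions corresponds to the usual TransE distance. A minimal sketch follows, assuming embeddings are plain lists of floats; the function name `score` and its `norm` parameter are illustrative and do not come from the source. The same function covers both triple types, since only the kind of vectors passed in differs.

```python
def score(head, relation, tail, norm=1):
    """Distance ||head + relation - tail|| under the 1-norm (default) or the 2-norm."""
    diff = [h + r - t for h, r, t in zip(head, relation, tail)]
    if norm == 1:
        return sum(abs(d) for d in diff)
    return sum(d * d for d in diff) ** 0.5


# Entity-entity case: head entity, entity relationship, tail entity.
PERFECT = score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
# Entity-attribute case: entity, attribute relationship, attribute.
DISTANT_L1 = score([0.0, 0.0], [3.0, 4.0], [0.0, 0.0])
DISTANT_L2 = score([0.0, 0.0], [3.0, 4.0], [0.0, 0.0], norm=2)
```

A score of 0 means the translated head lands exactly on the tail, which is the most plausible case under this model.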

The TransE model trains the embedding vectors by minimizing a margin-based loss. The loss function on entity-entity triples is defined as

where (e_i, re_l, e_j)_e ∈ Te denotes an entity-entity triple; (e_i′, re_l, e_j′)_e ∈ Te′ denotes an entity-entity triple obtained by randomly replacing e_i or e_j in (e_i, re_l, e_j)_e; γ is the margin hyperparameter; and [X]₊ equals X if X is greater than or equal to 0 and equals 0 otherwise. Minimizing the loss function makes the scores of correct entity-entity triples as small as possible and the scores of incorrect entity-entity triples as large as possible, so that the correctness of an entity-entity triple can be measured properly. The loss function on entity-attribute triples is defined as

where (e_i, ra_l, a_j)_a ∈ Ta denotes an entity-attribute triple; (e_i′, ra_l, a_j′)_a ∈ Ta′ denotes an entity-attribute triple obtained by randomly replacing e_i or a_j in (e_i, ra_l, a_j)_a; γ is the margin hyperparameter; and [X]₊ equals X if X is greater than or equal to 0 and equals 0 otherwise. Minimizing the loss function makes the scores of correct entity-attribute triples as small as possible and the scores of incorrect entity-attribute triples as large as possible, so that the correctness of an entity-attribute triple can be measured properly.
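The hinge term [X]₊ and the per-pair loss of formula (6) can be sketched as below. The function names are hypothetical, and summing the per-pair losses over a batch is an assumption based on the mention of a summation in the text.

```python
def pair_loss(pos_score, neg_score, gamma):
    """[pos_score - neg_score + gamma]+ : zero once the negative is worse by at least gamma."""
    return max(0.0, pos_score - neg_score + gamma)


def batch_loss(score_pairs, gamma):
    """Sum the hinge loss over (positive score, negative score) pairs."""
    return sum(pair_loss(pos, neg, gamma) for pos, neg in score_pairs)


# First pair already satisfies the margin; second pair violates it.
LOSS = batch_loss([(0.5, 2.0), (2.0, 0.5)], gamma=1.0)
```

Only the second pair contributes here, because its correct triple scores worse than its corrupted counterpart.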

S32. Based on the defined score function and loss function, train the embedding model on the entity-entity triples with the TransE model training algorithm to obtain trained embedding vectors.

Specifically, the TransE model training algorithm on entity-entity triples first randomly initializes each entity relationship embedding vector as a k-dimensional vector drawn from a uniform distribution and then normalizes it. Second, each entity embedding vector is randomly initialized as a k-dimensional vector drawn from a uniform distribution. Third, each entity embedding vector is normalized, b entity-entity triples are randomly sampled from the entity-entity triple set, and each sampled triple is negatively sampled; that is, its head entity or tail entity is randomly replaced with another entity to form a negative-sample entity-entity triple. Each sampled entity-entity triple is then added to the set T together with its negative-sample entity-entity triple. Finally, the loss is computed for each pair in the set T and the parameters are updated.
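The initialization, normalization, and negative sampling steps can be sketched as follows. All function names are hypothetical. The uniform range ±6/√k is borrowed from the original TransE formulation and is an assumption here, since the text only says "uniform distribution"; normalization is likewise assumed to mean scaling to unit 2-norm.

```python
import random


def normalize(vec):
    """Scale a vector to unit 2-norm (left unchanged if it is all zeros)."""
    length = sum(x * x for x in vec) ** 0.5
    return [x / length for x in vec] if length else vec


def init_embedding(k, rng):
    """Draw a k-dimensional vector uniformly, then normalize it."""
    bound = 6 / k ** 0.5  # assumed range, not stated in the source
    return normalize([rng.uniform(-bound, bound) for _ in range(k)])


def corrupt(triple, entities, rng):
    """Negative sampling: replace either the head or the tail with a different entity."""
    h, r, t = triple
    if rng.random() < 0.5:
        return (rng.choice([e for e in entities if e != h]), r, t)
    return (h, r, rng.choice([e for e in entities if e != t]))


def sample_batch(triples, entities, b, rng):
    """Build the set T: b sampled triples, each paired with its negative sample."""
    return [(pos, corrupt(pos, entities, rng)) for pos in rng.sample(triples, b)]


rng = random.Random(0)
ENTITIES = ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu"]
TRIPLES = [
    ("Zhang San", "classmate", "Zhao Liu"),
    ("Zhang San", "friend", "Li Si"),
    ("Li Si", "deskmate", "Wang Wu"),
]
VEC = init_embedding(4, rng)
T = sample_batch(TRIPLES, ENTITIES, 2, rng)
```

Each element of `T` is a (correct triple, corrupted triple) pair on which the margin loss would then be evaluated.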

The parameters are updated by gradient descent on the loss. For example, for the entity-entity triple (e₁, re₁, e₂)_e and its corresponding negative-sample entity-entity triple (e₁′, re₁, e₂′)_e, gradient descent is expressed as

where θ_i denotes a parameter, α denotes the learning rate, and J() denotes the loss function. Applying the gradient descent formula gives the TransE updates shown in the following five formulas:

Taking formula (1) as an example, dropping the summation from L_e(e₁, re₁, e₂, e₁′, e₂′) gives the following formula (6):

L_e(e₁，re₁，e₂，e₁′，e₂′)＝[f_e(e₁+re₁，e₂)-f_e(e₁′+re₁，e₂′)+γ]₊ (6)

Combining formula (1) and formula (6) yields the following formula (7):

where, if f_e(e₁+re₁, e₂) − f_e(e₁′+re₁, e₂′) + γ is less than or equal to 0, the parameter is not updated. If the value is greater than 0, the partial derivative with respect to e₁ is computed; with the 1-norm, for example, this yields a vector of the form [1, 1, 1, −1, ...], and the parameter is then updated using formula (1). α in formulas (1) to (5) denotes the learning rate; multiplying by the learning rate prevents zigzag oscillation that would keep training from converging. In each round, every parameter is updated so that the score of the correct entity-entity triple becomes as small as possible and the score of the incorrect entity-entity triple becomes as large as possible, which drives the overall loss toward 0. The other parameters are updated in the same way as e₁.
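For orientation, the update of e₁ under the 1-norm can be written out as below. This is a reconstruction from formula (6) and the surrounding description, not a copy of the original formulas; it assumes that e₁ appears only in the correct triple (so e₁′ is treated as a separate parameter) and that the hinge in formula (6) is active.

```latex
% Generic gradient descent step with learning rate \alpha
\theta_i \leftarrow \theta_i - \alpha \, \frac{\partial J(\theta)}{\partial \theta_i}

% Hinge active: f_e(e_1 + re_1, e_2) - f_e(e_1' + re_1, e_2') + \gamma > 0,
% with f_e(e_1 + re_1, e_2) = \lVert e_1 + re_1 - e_2 \rVert_1
\frac{\partial L_e}{\partial e_1}
  = \frac{\partial}{\partial e_1} \lVert e_1 + re_1 - e_2 \rVert_1
  = \operatorname{sign}(e_1 + re_1 - e_2)

% Resulting update for e_1
e_1 \leftarrow e_1 - \alpha \, \operatorname{sign}(e_1 + re_1 - e_2)
```

The sign vector has entries of +1 or −1 (0 where a component of e₁ + re₁ − e₂ vanishes), which matches the vector of the form [1, 1, 1, −1, ...] mentioned above.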

S33. Based on the defined score function and loss function, train the embedding model on the entity-attribute triples with the TransE model training algorithm to obtain trained embedding vectors; the implementation is the same as in S32. After the whole model has been trained, the entity embedding vector set, the entity relationship embedding vector set, the attribute embedding vector set, and the attribute relationship embedding vector set are obtained.

And S34, performing entity prediction, relationship prediction and attribute prediction on the knowledge graph based on the trained entity embedded vector set, attribute embedded vector set, entity relationship embedded vector set and attribute relationship embedded vector set to perform second completion. The specific process is as follows:

S341, completing the knowledge graph by using the entity prediction algorithm. Specifically, the input of the entity prediction algorithm includes a flag; the head entity embedding vector e_i, the tail entity embedding vector e_j or the attribute embedding vector a_k; the entity relationship embedding vector re_l or the attribute relationship embedding vector ra_m; and the entity embedding vector set E. The output includes a triple embedding vector candidate set C and a triple embedding vector candidate set S. The entity prediction algorithm performs head entity prediction on entity-entity triples, tail entity prediction on entity-entity triples, and head entity prediction on entity-attribute triples.

For head entity prediction on entity-entity triples, each entity embedding vector is combined with the given tail entity embedding vector e_j and entity relationship embedding vector re_l, and the score of each combination is calculated using the score function on entity-entity triples.

For tail entity prediction on entity-entity triples, each entity embedding vector is combined with the given head entity embedding vector e_i and entity relationship embedding vector re_l, and the score of each combination is calculated using the score function on entity-entity triples.

For head entity prediction on entity-attribute triples, each entity embedding vector is combined with the given attribute embedding vector a_k and attribute relationship embedding vector ra_m, and the score of each combination is calculated using the score function on entity-attribute triples.

The sort() function in the entity prediction algorithm sorts the triples by score in ascending order. The higher a triple ranks, the more likely it is to be correct, so the top-ranked triples are extracted and converted into entity-entity triples or entity-attribute triples. Finally, the newly obtained entity-entity triples and entity-attribute triples are added to the knowledge graph, which completes the knowledge graph completion by entity prediction.
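The ranking step of entity prediction can be sketched as follows. This is an illustrative sketch under stated assumptions: the score is taken to be the 1-norm of head + relation − tail for both kinds of triple, the flag values "head" and "tail", the `top_k` cut-off and the function names are hypothetical, and embeddings are plain lists keyed by entity name.

```python
def score(head, rel, tail):
    """Assumed score: 1-norm of head + rel - tail (smaller is more plausible)."""
    return sum(abs(h + r - t) for h, r, t in zip(head, rel, tail))


def predict_entities(flag, given, rel, entities, top_k=1):
    """Rank every entity in `entities` as the missing head or tail.

    flag == "head": `given` is the known tail entity (or attribute) embedding.
    flag == "tail": `given` is the known head entity embedding.
    Returns the top_k (score, entity_name) pairs, lowest score first.
    """
    candidates = []
    for name, vec in entities.items():
        if flag == "head":
            s = score(vec, rel, given)
        elif flag == "tail":
            s = score(given, rel, vec)
        else:
            raise ValueError("flag must be 'head' or 'tail'")
        candidates.append((s, name))
    ranked = sorted(candidates)  # ascending: best candidates come first
    return ranked[:top_k]


# Toy embeddings, invented purely for illustration.
entities = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [5.0, 5.0]}
relation = [1.0, 0.0]
best_tails = predict_entities("tail", entities["a"], relation, entities, top_k=2)
```

Head prediction on entity-attribute triples uses the same call with the attribute embedding as `given` and the attribute relationship embedding as `rel`.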

S342, completing the knowledge graph by using the attribute prediction algorithm. Specifically, the input of the attribute prediction algorithm includes the head entity embedding vector e_i, the attribute relationship embedding vector ra_m, and the attribute embedding vector set A. The output includes an entity-attribute triple embedding vector candidate set C and an entity-attribute triple embedding vector candidate set S. In attribute prediction, each attribute embedding vector is combined with the given head entity embedding vector e_i and attribute relationship embedding vector ra_m, and the score of each combination is calculated using the score function on entity-attribute triples.

The sort() function in the attribute prediction algorithm sorts the entity-attribute triples by score in ascending order. The higher a triple ranks, the more likely it is to hold, so the top-ranked triples are extracted and converted into entity-attribute triples. Finally, the newly obtained entity-attribute triples are added to the knowledge graph, which completes the knowledge graph completion by attribute prediction.

S343, completing the knowledge graph by using the relationship prediction algorithm. Specifically, the input of the relationship prediction algorithm includes a flag; the head entity embedding vector e_i; the tail entity embedding vector e_j or the attribute embedding vector a_k; the entity relationship embedding vector re_l or the attribute relationship embedding vector ra_m; and the entity relationship embedding vector set Re or the attribute relationship embedding vector set Ra. The output includes a triple embedding vector candidate set C and a triple embedding vector candidate set S. The relationship prediction algorithm performs entity relationship prediction and attribute relationship prediction.

For entity relationship prediction, each entity relationship embedding vector is combined with the given head entity embedding vector e_i and tail entity embedding vector e_j, and the score of each combination is calculated using the score function on entity-entity triples.

For attribute relationship prediction, each attribute relationship embedding vector is combined with the given head entity embedding vector e_i and attribute embedding vector a_k, and the score of each combination is calculated using the score function on entity-attribute triples.

The sort() function in the relationship prediction algorithm sorts the triples by score in ascending order. The higher a triple ranks, the more likely it is to hold, so the top-ranked triples are extracted and converted into entity-entity triples or entity-attribute triples. Finally, the newly obtained entity-entity triples and entity-attribute triples are added to the knowledge graph, which completes the knowledge graph completion by relationship prediction.
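Relationship prediction, together with the final step of adding the predicted triples to the knowledge graph, might look like the sketch below. It assumes the same 1-norm score, represents the knowledge graph as a Python set of (head, relation, tail) name triples, and uses hypothetical function names, a hypothetical `top_k` cut-off and invented toy vectors; only the triple (Zhengqi, classmate, Zhao Liu) is taken from the example in the description.

```python
def score(head, rel, tail):
    """Assumed score: 1-norm of head + rel - tail (smaller is more plausible)."""
    return sum(abs(h + r - t) for h, r, t in zip(head, rel, tail))


def predict_relations(head_name, tail_name, entity_vecs, relation_vecs, top_k=1):
    """Try every relationship embedding between a given head and tail.

    Returns the top_k symbolic triples, ordered by ascending score.
    """
    head, tail = entity_vecs[head_name], entity_vecs[tail_name]
    ranked = sorted(
        (score(head, rel, tail), rel_name)
        for rel_name, rel in relation_vecs.items()
    )
    return [(head_name, rel_name, tail_name) for _, rel_name in ranked[:top_k]]


def add_to_graph(graph, new_triples):
    """Add predicted triples to the graph; return those not already present."""
    added = [t for t in new_triples if t not in graph]
    graph.update(added)
    return added


# Toy vectors, invented for illustration only.
entity_vecs = {"Zhengqi": [0.0, 0.0], "Zhao Liu": [1.0, 1.0]}
relation_vecs = {"classmate": [1.0, 1.0], "friend": [0.0, 2.0]}
graph = set()
predicted = predict_relations("Zhengqi", "Zhao Liu", entity_vecs, relation_vecs)
newly_added = add_to_graph(graph, predicted)
```

For attribute relationship prediction the tail would be an attribute embedding and the candidates would come from the attribute relationship set instead.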

The method completes the knowledge graph twice, first based on rules and then based on embedding, so that the information in the knowledge graph becomes richer and more accurate.

As shown in fig. 3, the second, embedding-based completion is performed on the school social network knowledge graph that has already undergone the first, rule-based completion, where a bold circle represents a predicted head or tail entity, a bold solid line represents a predicted entity relationship, and a bold dashed line represents a predicted attribute relationship. For entity-entity triples, head entity prediction yields, for example, the head entity "Xiaohong" in the entity-entity triple (Xiaohong, friend, Xiaoming)_e; tail entity prediction yields the tail entity "Wang Wu" in the entity-entity triple (Sun Jie, student, Wang Wu)_e; and entity relationship prediction yields the entity relationship "classmate" in the entity-entity triple (Zhengqi, classmate, Zhao Liu)_e. For entity-attribute triples, head entity prediction yields the head entity "Xiaoming" in the entity-attribute triple (Xiaoming, punishment, fighting)_a; attribute prediction yields the attribute "fighting" in the entity-attribute triple (Li Si, punishment, fighting)_a; and attribute relationship prediction yields the attribute relationship "punishment" in the entity-attribute triple (Zhengqi, punishment, fighting)_a. From the existing and the completed entity-entity and entity-attribute triples it can be seen that Li Si bullies Zhao Liu, that Zhengqi and Zhao Liu are classmates, and that Zhengqi has been punished for fighting; from the relationship between Zhengqi and Zhao Liu and the attributes of Zhengqi, it can be inferred that Zhengqi may be bullying Zhao Liu.

By completing the school social network knowledge graph of this embodiment with the rule and embedding based knowledge graph completion method, it can be inferred that Zhang San, Li Si and Zhengqi may be bullying Zhao Liu. Based on this inference, it can be investigated further whether the three are in fact bullying Zhao Liu, so that corresponding measures can be taken to prevent the bullying from getting worse.

The above calculation examples merely explain the calculation model and calculation flow of the invention in detail and are not intended to limit its embodiments. Those skilled in the art can make other variations and modifications on the basis of the above description; it is not possible to list all embodiments exhaustively here, and all obvious variations and modifications derived from the technical solution of the invention fall within its scope of protection.

Claims

1. A knowledge graph completion method based on rules and embedding, characterized by specifically comprising the following steps:

and performing the second completion of the knowledge graph by using an entity prediction algorithm, an attribute prediction algorithm and a relationship prediction algorithm, thereby completing the knowledge graph.

2. The knowledge graph completion method based on rules and embedding according to claim 1, wherein the specific process of step S1 is as follows:

S13, defining the entity-attribute triples based on the entities in the entity set, the attributes in the attribute set and the attribute relationships in the attribute relationship set.

3. The knowledge graph completion method based on rules and embedding according to claim 2, wherein the entity-entity triples are completed by using the transfer rule, the specific process being as follows:

S211, the transfer rule is defined as

S212, extracting the key information of each specific transfer rule to form a transfer rule candidate tc, tc = (re_l, re_m, re_n)_c, wherein all the transfer rule candidates constitute a transfer rule candidate set Stc;

S214, obtaining a transfer rule completion entity-entity triple tet by using two known entity-entity triples in the knowledge graph and the corresponding transfer rule correct candidate trc, wherein the transfer rule completion entity-entity triples constitute a transfer rule completion entity-entity triple set Stet, Stet = {tet₁, tet₂, ..., tet_|Stet|}, tet₁ represents the 1st transfer rule completion entity-entity triple in the set Stet, and |Stet| is the number of transfer rule completion entity-entity triples contained in the set Stet;

S215, adding all entity-entity triples in the transfer rule completion entity-entity triple set Stet to the knowledge graph, thereby completing the knowledge graph completion based on the transfer rule.

4. The knowledge graph completion method based on rules and embedding according to claim 3, wherein the entity-entity triples are completed by using the antisymmetric rule, the specific process being as follows:

S221, the antisymmetric rule is defined as

wherein e_i, e_j represent entities, e_i, e_j ∈ E, and E represents the entity set; re_l, re_m represent entity relationships, re_l, re_m ∈ Re, and Re represents the entity relationship set; all specific antisymmetric rules that conform to the form of the antisymmetric rule are extracted from the knowledge graph;

S222, extracting the key information of each specific antisymmetric rule to form an antisymmetric rule candidate ac, ac = (re_l, re_m)_c, wherein all antisymmetric rule candidates constitute an antisymmetric rule candidate set Sac;

S225, adding all entity-entity triples in the antisymmetric rule completion entity-entity triple set Stea to the knowledge graph, thereby completing the knowledge graph completion based on the antisymmetric rule.

5. The knowledge graph completion method based on rules and embedding according to claim 4, wherein the entity-entity triples are completed by using the entity association rule, the specific process being as follows:

S231, the entity association rule is defined as

S233, filtering each entity association rule candidate rc by using the entity association rule candidate filtering algorithm to obtain an entity association rule correct candidate rrc, rrc = (re_l, e_j, re_m, e_k)_rr, wherein all the entity association rule correct candidates constitute an entity association rule correct candidate set Srrc;

S234, obtaining an entity association rule completion entity-entity triple ter by using a known entity-entity triple in the knowledge graph and its corresponding entity association rule correct candidate rrc, wherein the entity association rule completion entity-entity triples constitute an entity association rule completion entity-entity triple set Ster, Ster = {ter₁, ter₂, ..., ter_|Ster|}, ter₁ represents the 1st entity association rule completion entity-entity triple in the set Ster, and |Ster| is the number of entity association rule completion entity-entity triples contained in the set Ster;

S235, adding all entity-entity triples in the entity association rule completion entity-entity triple set Ster to the knowledge graph, thereby completing the knowledge graph completion based on the entity association rule.

6. The knowledge graph completion method based on rules and embedding according to claim 5, wherein the entity-attribute triples are completed by using the attribute association rule, the specific process being as follows:

S241, the attribute association rule is defined as

S243, filtering each attribute association rule candidate atrc by using the attribute association rule candidate filtering algorithm to obtain an attribute association rule correct candidate arrc, arrc = (ra_l, a_j, ra_m, a_k)_aar, wherein all the attribute association rule correct candidates constitute an attribute association rule correct candidate set Sarrc;

S245, adding all entity-attribute triples in the attribute association rule completion entity-attribute triple set Star to the knowledge graph, thereby completing the knowledge graph completion based on the attribute association rule.

7. The knowledge graph completion method based on rules and embedding according to claim 6, wherein in step S3 the entities, attributes, entity relationships and attribute relationships in the knowledge graph after the first completion are converted into entity embedding vectors, attribute embedding vectors, entity relationship embedding vectors and attribute relationship embedding vectors respectively, the specific process being as follows:

and similarly, obtaining an attribute embedded vector, an entity relationship embedded vector and an attribute relationship embedded vector.

8. The knowledge graph completion method based on rules and embedding according to claim 7, wherein the score function on entity-entity triples is defined as:

wherein,

represents the 1-norm or the 2-norm;

the loss function on entity-entity triples is defined as:

wherein L_e represents the loss function on the entity-entity triple (e_i, re_l, e_j)_e, where (e_i, re_l, e_j)_e ∈ Te and Te denotes the set of entity-entity triples; (e_i′, re_l, e_j′)_e denotes the replaced entity-entity triple obtained by randomly replacing e_i or e_j in (e_i, re_l, e_j)_e, where (e_i′, re_l, e_j′)_e ∈ Te′ and Te′ denotes the set of replaced entity-entity triples,

denotes the entity embedding vector corresponding to e_i′,

the score function and the loss function on entity-attribute triples are defined in the same way.

9. The knowledge graph completion method based on rules and embedding according to claim 8, wherein the entity-entity triples are trained by using the TransE model training algorithm, which comprises:

randomly sampling b entity-entity triples from the entity-entity triple set of the knowledge graph after the first completion, and then performing negative sampling on the sampled entity-entity triples, that is, randomly replacing the head entity of an entity-entity triple with another entity or its tail entity with another entity, the replaced entity-entity triples forming the negative-sample entity-entity triples, and then adding each entity-entity triple together with its corresponding negative-sample entity-entity triple to the set T;

the entity-attribute triples are trained in the same way as above.

10. The knowledge graph completion method based on rules and embedding according to claim 9, wherein the second completion of the knowledge graph is performed by using the entity prediction algorithm, the attribute prediction algorithm and the relationship prediction algorithm, the specific process being as follows:

S331, completing the knowledge graph by using the entity prediction algorithm

and the entity relationship embedding vector

are combined, a score is calculated for each combination using the score function on entity-entity triples, and the entity-entity triple embedding vector candidate composed of the traversed entity embedding vector, the given entity relationship embedding vector

and the tail entity embedding vector

is then stored;

and the entity relationship embedding vector

and the head entity embedding vector

compose the entity-entity triple embedding vector candidate;

and the attribute relationship embedding vector

are combined, a score is calculated for each combination using the score function on entity-attribute triples, and then storing the traversed entity embedding vector, together with the given attribute relationship embedding vector