CN112818134A - Knowledge graph completion method based on rules and embedding - Google Patents

Knowledge graph completion method based on rules and embedding

Info

Publication number
CN112818134A
Authority
CN
China
Prior art keywords
entity
attribute
rule
triples
knowledge graph
Prior art date
Legal status
Pending
Application number
CN202110197370.9A
Other languages
Chinese (zh)
Inventor
李晋
朴贤姬
项金鹏
程建华
白玉
王春波
Current Assignee
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN202110197370.9A
Publication of CN112818134A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01: Social networking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/20: Education
    • G06Q50/205: Education administration or guidance


Abstract

A knowledge graph completion method based on rules and embedding belongs to the technical field of campus bullying behavior prediction. The invention solves the problem that existing research methods cannot achieve both accuracy and usability at the same time. The method first defines the relation triples in the knowledge graph and performs a first completion of the knowledge graph based on four relation rules; it then performs a second completion, using an embedding-based knowledge graph completion method, on the knowledge graph obtained from the first completion. In this way a knowledge graph completion method with both high accuracy and good usability is provided. The method can be applied to campus bullying behavior prediction.

Description

Knowledge graph completion method based on rules and embedding
Technical Field
The invention belongs to the technical field of campus bullying behavior prediction, and particularly relates to a knowledge graph completion method based on rules and embedding.
Background
With the rapid development of society, Internet technology has also advanced quickly, and more and more individuals and organizations use the Internet to store and process hundreds of millions of interrelated data records. This development has led to an explosive growth of data on the Internet. How to represent such large-scale relational data more completely has become a research hotspot at home and abroad. The knowledge graph is an intuitive, efficient and accurate knowledge representation suited to processing large amounts of multi-relational data, and large-scale relational data on the Internet can be represented visually with a knowledge graph. Although knowledge graphs bring many benefits, missing relations in a knowledge graph also cause many problems in practical applications. Completing a knowledge graph means performing knowledge reasoning on the existing knowledge graph, that is, inferring implicit relational data from the knowledge already in the graph and adding the inferred relational data to it. Knowledge graph completion can help people mine more useful information, plays an indispensable role in subsequent data analysis work, and is an important direction of current research at home and abroad.
Existing knowledge graph completion research is mainly divided into rule-based knowledge graph completion methods and embedding-based knowledge graph completion methods. The former have the advantages of higher accuracy and stronger interpretability, but poorer generalization ability and higher complexity; the latter have the advantages of better usability and scalability, but lower accuracy and weaker interpretability. Therefore, existing research methods cannot achieve both high accuracy and good usability at the same time.
Disclosure of Invention
The invention aims to solve the problem that existing research methods cannot take both accuracy and usability into account, and provides a knowledge graph completion method based on rules and embedding.
The technical scheme adopted by the invention to solve the above technical problem is as follows: a knowledge graph completion method based on rules and embedding, which specifically comprises the following steps:
step S1, defining relation triples in the knowledge graph, wherein the relation triples comprise entity-entity triples and entity-attribute triples;
step S2, completing the entity-entity triples by using the transfer rule, the antisymmetric rule and the entity association rule, and completing the entity-attribute triples by using the attribute association rule, to obtain the knowledge graph after the first completion;
step S3, converting the entity, attribute, entity relationship and attribute relationship in the knowledge graph after the first completion into an entity embedding vector, an attribute embedding vector, an entity relationship embedding vector and an attribute relationship embedding vector respectively;
taking the entity embedding vector, the attribute embedding vector, the entity relationship embedding vector and the attribute relationship embedding vector as the input of a TransE model, defining a score function on an entity-entity triple, a loss function on the entity-entity triple, a score function on the entity-attribute triple and a loss function on the entity-attribute triple, and training the entity-entity triple and the entity-attribute triple by using a TransE model training algorithm to obtain the trained entity embedding vector, attribute embedding vector, entity relationship embedding vector and attribute relationship embedding vector;
performing a second completion of the knowledge graph by using an entity prediction algorithm, an attribute prediction algorithm and a relation prediction algorithm, thereby finishing the knowledge graph completion;
further, the specific process of step S1 is as follows:
s11, all entities in the knowledge graph form an entity set, all attributes form an attribute set, all entity relations form an entity relation set, and all attribute relations form an attribute relation set;
s12, defining an entity-entity triple based on the entity in the entity set and the entity relationship in the entity relationship set;
s13, defining an entity-attribute triple based on the entity in the entity set, the attribute in the attribute set and the attribute relationship in the attribute relationship set;
further, completing the entity-entity triples by using the transfer rule specifically comprises:
S211, the transfer rule is defined as

(e_i, re_l, e_j)_e ∧ (e_j, re_m, e_k)_e → (e_i, re_n, e_k)_e

where e_i, e_j, e_k represent entities, e_i, e_j, e_k ∈ E, E represents the entity set, re_l, re_m, re_n represent entity relations, re_l, re_m, re_n ∈ Re, Re represents the entity relation set, and (·)_e denotes an entity-entity triple; all specific transfer rules conforming to the transfer rule form are extracted from the knowledge graph;

S212, the key information of each specific transfer rule is extracted to form a transfer rule candidate tc = (re_l, re_m, re_n)_c, and all transfer rule candidates constitute the transfer rule candidate set Stc;

S213, each transfer rule candidate tc is filtered with the transfer rule candidate filtering algorithm to obtain the transfer rule correct candidates trc = (re_l, re_m, re_n)_r, and all transfer rule correct candidates constitute the transfer rule correct candidate set Strc;

S214, a transfer-rule completion entity-entity triple tet is obtained from two known entity-entity triples in the knowledge graph and the corresponding transfer rule correct candidate trc; all transfer-rule completion entity-entity triples form the transfer-rule completion entity-entity triple set Stet = {tet_1, tet_2, ..., tet_|Stet|}, where tet_1 denotes the 1st transfer-rule completion entity-entity triple in the set Stet and |Stet| is the number of transfer-rule completion entity-entity triples contained in Stet;

S215, all entity-entity triples in the transfer-rule completion entity-entity triple set Stet are added to the knowledge graph, completing the knowledge graph completion based on the transfer rule;
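As an illustration of steps S211 to S215, a minimal Python sketch of transfer-rule-based completion is given below; the function name, the representation of triples as plain (head, relation, tail) tuples, and the assumption that the filtering step has already produced the correct candidate set are illustrative choices rather than the patent's exact algorithm.

from itertools import product

def complete_with_transfer_rules(entity_triples, correct_transfer_rules):
    """Apply each transfer rule correct candidate (re_l, re_m, re_n):
    if (e_i, re_l, e_j) and (e_j, re_m, e_k) are both in the graph,
    infer the new entity-entity triple (e_i, re_n, e_k)."""
    existing = set(entity_triples)
    inferred = set()
    for re_l, re_m, re_n in correct_transfer_rules:
        firsts = [(h, t) for h, r, t in existing if r == re_l]
        seconds = [(h, t) for h, r, t in existing if r == re_m]
        for (e_i, e_j), (e_j2, e_k) in product(firsts, seconds):
            if e_j == e_j2 and (e_i, re_n, e_k) not in existing:
                inferred.add((e_i, re_n, e_k))
    return inferred

triples = {("e1", "deskmate", "e2"), ("e2", "same class", "e3")}
rules = {("deskmate", "same class", "classmate")}
print(complete_with_transfer_rules(triples, rules))  # {('e1', 'classmate', 'e3')}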
further, the completion of the entity-entity triplet by using the antisymmetric rule includes the following specific processes:
S221, the antisymmetric rule is defined as

(e_i, re_l, e_j)_e → (e_j, re_m, e_i)_e

where e_i, e_j represent entities, e_i, e_j ∈ E, E represents the entity set, re_l, re_m represent entity relations, re_l, re_m ∈ Re, and Re represents the entity relation set; all specific antisymmetric rules conforming to the antisymmetric rule form are extracted from the knowledge graph;

S222, the key information of each specific antisymmetric rule is extracted to form an antisymmetric rule candidate ac = (re_l, re_m)_c, and all antisymmetric rule candidates constitute the antisymmetric rule candidate set Sac;

S223, each antisymmetric rule candidate ac is filtered with the antisymmetric rule candidate filtering algorithm to obtain the antisymmetric rule correct candidates arc = (re_l, re_m)_r, and all antisymmetric rule correct candidates constitute the antisymmetric rule correct candidate set Sarc;

S224, an antisymmetric-rule completion entity-entity triple tea is obtained from a known entity-entity triple in the knowledge graph and the corresponding antisymmetric rule correct candidate arc; all antisymmetric-rule completion entity-entity triples form the antisymmetric-rule completion entity-entity triple set Stea = {tea_1, tea_2, ..., tea_|Stea|}, where tea_1 denotes the 1st antisymmetric-rule completion entity-entity triple in the set Stea and |Stea| is the number of antisymmetric-rule completion entity-entity triples contained in Stea;

S225, all entity-entity triples in the antisymmetric-rule completion entity-entity triple set Stea are added to the knowledge graph, completing the knowledge graph completion based on the antisymmetric rule;
further, completing the entity-entity triples by using the entity association rule comprises the following specific process:
S231, the entity association rule is defined as

(e_i, re_l, e_j)_e → (e_i, re_m, e_k)_e

where e_i, e_j, e_k ∈ E and e_j ≠ e_k, e_i, e_j, e_k represent entities, E represents the entity set, re_l, re_m ∈ Re, re_l, re_m represent entity relations, and Re represents the entity relation set; all specific entity association rules conforming to the entity association rule form are extracted from the knowledge graph;

S232, the key information of each specific entity association rule is extracted to form an entity association rule candidate rc = (re_l, e_j, re_m, e_k)_cr, and all entity association rule candidates constitute the entity association rule candidate set Src;

S233, each entity association rule candidate rc is filtered with the entity association rule candidate filtering algorithm to obtain the entity association rule correct candidates rrc = (re_l, e_j, re_m, e_k)_rr, and all entity association rule correct candidates constitute the entity association rule correct candidate set Srrc;

S234, an entity-association-rule completion entity-entity triple ter is obtained from a known entity-entity triple in the knowledge graph and the corresponding entity association rule correct candidate rrc; all entity-association-rule completion entity-entity triples form the entity-association-rule completion entity-entity triple set Ster = {ter_1, ter_2, ..., ter_|Ster|}, where ter_1 denotes the 1st entity-association-rule completion entity-entity triple in the set Ster and |Ster| is the number of entity-association-rule completion entity-entity triples contained in Ster;

S235, all entity-entity triples in the entity-association-rule completion entity-entity triple set Ster are added to the knowledge graph, completing the knowledge graph completion based on the entity association rule;
further, the completion of the entity-attribute triple by using the attribute association rule includes the following specific processes:
S241, the attribute association rule is defined as

(e_i, ra_l, a_j)_a → (e_i, ra_m, a_k)_a

where e_i ∈ E, e_i represents an entity, E represents the entity set, a_j, a_k ∈ A and a_j ≠ a_k, a_j, a_k represent attributes, A represents the attribute set, ra_l, ra_m ∈ Ra, ra_l, ra_m represent attribute relations, Ra represents the attribute relation set, and (·)_a denotes an entity-attribute triple; all specific attribute association rules conforming to the attribute association rule form are extracted from the knowledge graph;

S242, the key information of each specific attribute association rule is extracted to form an attribute association rule candidate atrc = (ra_l, a_j, ra_m, a_k)_car, and all attribute association rule candidates constitute the attribute association rule candidate set Satrc;

S243, each attribute association rule candidate atrc is filtered with the attribute association rule candidate filtering algorithm to obtain the attribute association rule correct candidates arrc = (ra_l, a_j, ra_m, a_k)_aar, and all attribute association rule correct candidates constitute the attribute association rule correct candidate set Sarrc;

S244, an attribute-association-rule completion entity-attribute triple tar is obtained from a known entity-attribute triple in the knowledge graph and the corresponding attribute association rule correct candidate arrc; all attribute-association-rule completion entity-attribute triples form the attribute-association-rule completion entity-attribute triple set Star = {tar_1, tar_2, ..., tar_|Star|}, where tar_1 denotes the 1st attribute-association-rule completion entity-attribute triple in the set Star and |Star| is the number of attribute-association-rule completion entity-attribute triples contained in Star;

S245, all entity-attribute triples in the attribute-association-rule completion entity-attribute triple set Star are added to the knowledge graph, completing the knowledge graph completion based on the attribute association rule;
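A corresponding sketch for steps S241 to S245 (attribute association rules); the 4-tuple form (ra_l, a_j, ra_m, a_k) of a correct candidate follows the notation above, while the function name and data layout are illustrative assumptions.

def complete_with_attribute_rules(attribute_triples, correct_attribute_rules):
    """Apply each correct attribute association rule (ra_l, a_j, ra_m, a_k):
    if (e_i, ra_l, a_j) exists in the graph, infer the new entity-attribute
    triple (e_i, ra_m, a_k)."""
    existing = set(attribute_triples)
    inferred = set()
    for ra_l, a_j, ra_m, a_k in correct_attribute_rules:
        for e_i, ra, a in existing:
            if ra == ra_l and a == a_j and (e_i, ra_m, a_k) not in existing:
                inferred.add((e_i, ra_m, a_k))
    return inferred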
further, in step S3, the entity, the attribute, the entity relationship, and the attribute relationship in the knowledge graph after the first completion are respectively converted into an entity embedding vector, an attribute embedding vector, an entity relationship embedding vector, and an attribute relationship embedding vector, and the specific process is as follows:
each entity is randomly initialized, using a uniform distribution, as a vector of dimension k, and the randomly initialized vector is then normalized to obtain the entity embedding vector;
obtaining an attribute embedded vector, an entity relationship embedded vector and an attribute relationship embedded vector in the same way;
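A minimal sketch of this initialization; the uniform range of ±6/√k and the use of the L2 norm follow the usual TransE choices and are assumptions here, since the text only specifies a uniform distribution followed by normalization.

import numpy as np

def init_embeddings(names, k, rng=np.random.default_rng(0)):
    """Initialize one k-dimensional vector per name from a uniform
    distribution and normalize it to unit L2 norm."""
    bound = 6.0 / np.sqrt(k)               # common TransE choice (assumption)
    emb = {}
    for name in names:
        v = rng.uniform(-bound, bound, size=k)
        emb[name] = v / np.linalg.norm(v)  # normalization step
    return emb

entity_emb = init_embeddings(["Zhang San", "Li Si"], k=50)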
further, the score function on an entity-entity triple is defined as:

f_e(e_i, re_l, e_j) = || e_i + re_l − e_j ||_{L1 or L2}

where f_e(e_i, re_l, e_j) represents the score function of the entity-entity triple (e_i, re_l, e_j)_e, e_i and e_j in the formula denote the entity embedding vectors corresponding to the entities e_i and e_j, re_l denotes the entity relation embedding vector corresponding to the entity relation re_l, and || · ||_{L1 or L2} denotes the L1 norm or the L2 norm;

the loss function on entity-entity triples is defined as:

L_e = Σ_{(e_i, re_l, e_j)_e ∈ Te} Σ_{(e_i', re_l, e_j')_e ∈ Te'} [ γ + f_e(e_i, re_l, e_j) − f_e(e_i', re_l, e_j') ]_+

where L_e represents the loss function on entity-entity triples, (e_i, re_l, e_j)_e ∈ Te, Te denotes the set of entity-entity triples, (e_i', re_l, e_j')_e denotes the replaced entity-entity triple obtained by randomly replacing e_i or e_j in (e_i, re_l, e_j)_e, (e_i', re_l, e_j')_e ∈ Te', Te' denotes the set of replaced entity-entity triples, e_i' and e_j' denote the entity embedding vectors corresponding to e_i' and e_j', γ is a margin hyper-parameter, and [x]_+ equals x if x is greater than or equal to 0 and equals 0 otherwise;

the score function on entity-attribute triples and the loss function on entity-attribute triples are defined in the same way;
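The score and loss above are the standard TransE margin formulation; a small NumPy sketch of them (the dictionary-based embedding lookup and the function names are illustrative assumptions) is:

import numpy as np

def score(head_vec, rel_vec, tail_vec, norm=1):
    """TransE score f(h, r, t) = ||h + r - t||; lower means more plausible."""
    return np.linalg.norm(head_vec + rel_vec - tail_vec, ord=norm)

def margin_loss(pos_triple, neg_triple, ent_emb, rel_emb, gamma=1.0, norm=1):
    """Margin loss [gamma + f(positive) - f(negative)]_+ for one positive
    entity-entity triple and one negatively sampled triple."""
    (h, r, t), (h2, r2, t2) = pos_triple, neg_triple
    pos = score(ent_emb[h], rel_emb[r], ent_emb[t], norm)
    neg = score(ent_emb[h2], rel_emb[r2], ent_emb[t2], norm)
    return max(0.0, gamma + pos - neg)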
further, the training of the entity-entity triplet by using the TransE model training algorithm specifically includes:
randomly sampling b entity-entity triples from the entity-entity triple set of the knowledge graph after the first completion, and then performing negative sampling on the sampled entity-entity triples, that is, randomly replacing the head entity of an entity-entity triple with another entity or its tail entity with another entity; the replaced entity-entity triples form the negative-sampled entity-entity triples, and each sampled entity-entity triple together with its corresponding negative-sampled entity-entity triples is added as a combination to the set T;
for example, if 3 triples A, B and C are sampled and the negative-sampled entity-entity triples corresponding to A are A1, A2, A3, ..., AM, the combination of A with A1, A2, A3, ..., AM is added to the set T and the loss function of this combination is calculated; the negative-sampled entity-entity triples corresponding to B are B1, B2, B3, ..., BM, the combination of B with B1, B2, B3, ..., BM is added to the set T and the loss function of this combination is calculated; the negative-sampled entity-entity triples corresponding to C are C1, C2, C3, ..., CM, the combination of C with C1, C2, C3, ..., CM is added to the set T and the loss function of this combination is calculated;
the loss function value of each combination in the set T is calculated in turn, and after the loss function value of each combination is calculated, the parameters of the TransE model are updated by gradient descent;
the entity-attribute triples are trained in the same way;
after the parameters of the TransE model have been updated, the trained TransE model is obtained, and the trained entity embedding vectors, attribute embedding vectors, entity relation embedding vectors and attribute relation embedding vectors are output by the trained TransE model;
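Putting the pieces together, a highly simplified training loop in the spirit of the described TransE training algorithm (batch size b, a single negative sample per positive triple, plain SGD on embedding dictionaries; these simplifications and all names are assumptions):

import random
import numpy as np

def train_epoch(triples, ent_emb, rel_emb, b=32, gamma=1.0, lr=0.01):
    """One simplified epoch: sample b positive entity-entity triples, corrupt
    the head or the tail to get a negative triple, and apply SGD on the
    margin loss [gamma + f(pos) - f(neg)]_+ with the L2 distance."""
    entities = list(ent_emb)
    for h, r, t in random.sample(list(triples), min(b, len(triples))):
        if random.random() < 0.5:
            h2, t2 = random.choice(entities), t   # corrupt the head entity
        else:
            h2, t2 = h, random.choice(entities)   # corrupt the tail entity
        d_pos = ent_emb[h] + rel_emb[r] - ent_emb[t]
        d_neg = ent_emb[h2] + rel_emb[r] - ent_emb[t2]
        if gamma + np.linalg.norm(d_pos) - np.linalg.norm(d_neg) > 0:
            g_pos = d_pos / (np.linalg.norm(d_pos) + 1e-9)  # d||x||/dx = x/||x||
            g_neg = d_neg / (np.linalg.norm(d_neg) + 1e-9)
            ent_emb[h]  -= lr * g_pos
            ent_emb[t]  += lr * g_pos
            ent_emb[h2] += lr * g_neg
            ent_emb[t2] -= lr * g_neg
            rel_emb[r]  -= lr * (g_pos - g_neg)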
further, the second completion of the knowledge graph is performed by using the entity prediction algorithm, the attribute prediction algorithm and the relation prediction algorithm, and the specific process is as follows:
S331, completing the knowledge graph by using the entity prediction algorithm
The input of the entity prediction algorithm comprises a flag, a head entity embedding vector or tail entity embedding vector or attribute embedding vector, an entity relation embedding vector or attribute relation embedding vector, and the set of entity embedding vectors; the output is a triple embedding vector candidate set C and the corresponding triple embedding vector candidate score set S;
the entity prediction algorithm performs head entity prediction on entity-entity triples, tail entity prediction on entity-entity triples, and head entity prediction on entity-attribute triples, respectively;
for head entity prediction on entity-entity triples, each trained entity embedding vector is combined with the given trained tail entity embedding vector and entity relation embedding vector, the score of each combination is calculated with the score function on entity-entity triples, and the entity-entity triple embedding vector candidate composed of the traversed entity embedding vector together with the given entity relation embedding vector and tail entity embedding vector is stored;
for tail entity prediction on entity-entity triples, each trained entity embedding vector is combined with the given trained head entity embedding vector and entity relation embedding vector, the score of each combination is calculated with the score function on entity-entity triples, and the entity-entity triple embedding vector candidate composed of the traversed entity embedding vector together with the given entity relation embedding vector and head entity embedding vector is stored;
for head entity prediction on entity-attribute triples, each trained entity embedding vector is combined with the given trained attribute embedding vector and attribute relation embedding vector, the score of each combination is calculated with the score function on entity-attribute triples, and the entity-attribute triple embedding vector candidate composed of the traversed entity embedding vector together with the given attribute relation embedding vector and attribute embedding vector is stored;
the sort() function in the entity prediction algorithm arranges the scores of the triples in ascending order, and the first N triples in this order are added to the knowledge graph, completing the knowledge graph completion by the entity prediction algorithm;
the higher a triple ranks in the score ordering, the more likely it is to be correct, so the top-ranked triples in the score ordering are extracted (in the invention N is taken as 1426) and converted into entity-entity triples or entity-attribute triples; finally, the newly obtained entity-entity triples and entity-attribute triples are added to the knowledge graph, completing the knowledge graph completion by entity prediction;
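A condensed sketch of head entity prediction on entity-entity triples (tail entity prediction and head entity prediction on entity-attribute triples only change which element is held fixed); scoring every candidate, sorting ascending and keeping the top N follows the description above, while the concrete signature is an assumption.

import numpy as np

def predict_head_entities(rel_vec, tail_vec, ent_emb, top_n=10):
    """Score every candidate head entity with f(h, r, t) = ||h + r - t||
    and return the top_n lowest-scoring (most plausible) entity names."""
    scored = [(np.linalg.norm(vec + rel_vec - tail_vec), name)
              for name, vec in ent_emb.items()]
    scored.sort()                      # ascending: best candidates first
    return [name for _, name in scored[:top_n]]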
S332, completing the knowledge graph by using the attribute prediction algorithm
Specifically, the input of the attribute prediction algorithm comprises a head entity embedding vector, an attribute relation embedding vector and the set of attribute embedding vectors; the output comprises the entity-attribute triple embedding vector candidate set C and the entity-attribute triple embedding vector candidate score set S.
The attribute prediction algorithm combines each trained attribute embedding vector with the given trained head entity embedding vector and attribute relation embedding vector, calculates the score of each combination with the score function on entity-attribute triples, and stores the entity-attribute triple embedding vector candidate composed of the traversed attribute embedding vector together with the given attribute relation embedding vector and head entity embedding vector;
the sort() function in the attribute prediction algorithm arranges the scores of the entity-attribute triples in ascending order, and the first N triples in this order are added to the knowledge graph, completing the knowledge graph completion by the attribute prediction algorithm;
the higher an entity-attribute triple ranks in the score ordering, the more likely it is to hold, so the top-ranked entity-attribute triples in the score ordering are extracted and converted into entity-attribute triples; finally, the newly obtained entity-attribute triples are added to the knowledge graph, completing the knowledge graph completion by attribute prediction;
S333, completing the knowledge graph by using the relation prediction algorithm
Specifically, the input of the relation prediction algorithm comprises a flag, a head entity embedding vector, a tail entity embedding vector or attribute embedding vector, an entity relation embedding vector or attribute relation embedding vector, and the set of entity relation embedding vectors or the set of attribute relation embedding vectors; the output comprises a triple embedding vector candidate set C and a triple embedding vector candidate score set S. The relation prediction algorithm performs entity relation prediction and attribute relation prediction, respectively.
For entity relation prediction, each trained entity relation embedding vector is combined with the given trained head entity embedding vector and tail entity embedding vector, the score of each combination is calculated with the score function on entity-entity triples, and the entity-entity triple embedding vector candidate composed of the traversed entity relation embedding vector together with the given head entity embedding vector and tail entity embedding vector is stored;
for attribute relation prediction, each trained attribute relation embedding vector is combined with the given trained head entity embedding vector and attribute embedding vector, the score of each combination is calculated with the score function on entity-attribute triples, and the entity-attribute triple embedding vector candidate composed of the traversed attribute relation embedding vector together with the given head entity embedding vector and attribute embedding vector is stored;
the sort() function in the relation prediction algorithm arranges the scores of the triples in ascending order, and the first N triples in this order are added to the knowledge graph, completing the knowledge graph completion by the relation prediction algorithm.
The higher a triple ranks in the score ordering, the more likely it is to hold, so the top-ranked triples in the score ordering are extracted and converted into entity-entity triples or entity-attribute triples. Finally, the newly obtained entity-entity triples and entity-attribute triples are added to the knowledge graph, completing the knowledge graph completion by relation prediction.
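Relation prediction follows the same ranking pattern but traverses the relation embeddings instead of the entity embeddings; a sketch under the same assumptions as above:

import numpy as np

def predict_relations(head_vec, tail_vec, rel_emb, top_n=10):
    """Score every candidate relation r with ||h + r - t|| and return the
    top_n lowest-scoring relation names."""
    scored = [(np.linalg.norm(head_vec + vec - tail_vec), name)
              for name, vec in rel_emb.items()]
    scored.sort()
    return [name for _, name in scored[:top_n]]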
The invention has the following beneficial effects: the invention provides a knowledge graph completion method based on rules and embedding, by which a knowledge graph completion method with both high accuracy and good usability is obtained.
By applying the provided knowledge graph completion method to complete and analyze the school social network knowledge graph, hidden school violence events can be discovered, and the occurrence of school violence events can thus be prevented and reduced.
Drawings
FIG. 1 is an exemplary diagram of rule-based knowledge graph completion;
FIG. 2 is a schematic diagram of the embedding-based knowledge graph completion process;
FIG. 3 is an exemplary diagram of embedding-based knowledge graph completion.
Detailed Description
The knowledge graph completion method based on rules and embedding provided by the invention can be applied to the completion of school social network knowledge graphs, and latent campus violence events can be predicted based on the completed school social network knowledge graph. The specific implementation process is as follows:
S1, defining the relation triples in the knowledge graph of the school social network; the specific process is as follows:
S11, defining the entity set, the attribute set, the entity relation set and the attribute relation set in the knowledge graph of the school social network. The entity set in the knowledge graph of the school social network is denoted E = {e_1, e_2, ..., e_|E|}, where e_1, e_2, ..., e_|E| represent entities in the school social network; the attribute set is denoted A = {a_1, a_2, ..., a_|A|}, where a_1, a_2, ..., a_|A| represent attributes in the school social network; the relation set is denoted R = {Re, Ra}, where Re = {re_1, re_2, ..., re_|Re|} represents the entity relations in the school social network and Ra = {ra_1, ra_2, ..., ra_|Ra|} represents the attribute relations in the school social network.
Specifically, in this embodiment the entity set is represented by E = {Zhang San, Li Si, Wang Wu, ...}; A = {...} represents the attribute set; Re = {...} represents the entity relation set; Ra = {...} represents the attribute relation set.
S12, defining entity-entity triples based on the entities in the entity set and the entity relations in the entity relation set. An entity-entity triple is te = (e_i, re_k, e_j)_e, te ∈ Te, where e_i, e_j ∈ E represent entities in the knowledge graph of the school social network, e_i represents the head entity, e_j represents the tail entity, re_k ∈ Re represents an entity relation in the knowledge graph of the school social network, and Te = {te_1, te_2, ..., te_|Te|} represents the set of entity-entity triples of the school social network.
Specifically, in this embodiment the entity-entity triple (Zhang San, friend, Li Si)_e indicates that Zhang San's friend is Li Si.
S13, defining entity-attribute triples based on the entities in the entity set, the attributes in the attribute set and the attribute relations in the attribute relation set. An entity-attribute triple is ta = (e_i, ra_k, a_j)_a, ta ∈ Ta, where e_i ∈ E represents an entity in the knowledge graph of the school social network, e_i represents the head entity, a_j ∈ A represents an attribute in the knowledge graph of the school social network, a_j represents the tail attribute, ra_k ∈ Ra represents an attribute relation in the knowledge graph of the school social network, and Ta = {ta_1, ta_2, ..., ta_|Ta|} represents the set of entity-attribute triples of the school social network.
Specifically, in this embodiment the entity-attribute triple (Zhang San, punishment, fighting)_a indicates that the disciplinary punishment Zhang San has received is for fighting.
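For concreteness, the two kinds of triples could be represented as simple typed records; the class and field names below are illustrative assumptions, not notation from the patent.

from typing import NamedTuple

class EntityTriple(NamedTuple):      # (e_i, re_k, e_j)_e
    head: str
    relation: str
    tail: str

class AttributeTriple(NamedTuple):   # (e_i, ra_k, a_j)_a
    head: str
    relation: str
    attribute: str

te = EntityTriple("Zhang San", "friend", "Li Si")
ta = AttributeTriple("Zhang San", "punishment", "fighting")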
S2, performing first completion on the knowledge graph based on four relation rules, wherein the specific process is as follows:
and S21, completing the entity-entity trigram set of the school social network based on the transmission rule. First, a delivery rule is defined as
Figure BDA0002947523010000101
Wherein ei,ej,ekE denotes the entity, rel,rem,renAnd e.g. Re represents entity relationship, and all specific delivery rules which accord with the delivery rule form are extracted from the knowledge graph of the school social network. Specifically, in this embodiment, the specific delivery rule extracted from the knowledge graph may be
Figure BDA0002947523010000105
Secondly, extracting the key information of a specific delivery rule to form a delivery rule candidate tc (re) ═ tcl,rem,ren)cWherein rel,rem,renE Re represents the entity relationship. Specifically, in the present embodiment, for the specific delivery rule, the specific delivery rule candidate that is configured by extracting the key information therein is tc1Becoming (classmate )cAnd tc2Becoming (with table, friend, classmate)c. All delivery rule candidates constitute a delivery rule candidate set Stc.
Third, for tc1Becoming (classmate )cIf e is1On the same table is2And e2Is e3It is obvious that e can be obtained1Is of the same school as e3It means that the same table entity relationship and the same class entity relationship are passed through out the same school entity relationship, hence tc1Is a correct delivery rule candidate. But for tc2Becoming (with table, friend, classmate)cIf e is4On the same table is5And e5Is e6Obviously, e is not necessarily obtained4The classmates are e6And therefore tc2Is an erroneous delivery rule candidate. Each delivery rule candidate needs to be filtered to determine its correctness. Filtering each transmission rule candidate tc by using a transmission rule candidate filtering algorithm to obtain a transmission rule correct candidate trc, where trc is (re)l,rem,ren)rWherein re1,rem,renE Re represents the entity relationship. All delivery rule correct candidates constitute a delivery rule correct candidate set Strc.
The input of the transfer rule candidate filtering algorithm comprises the entity-entity triple set Te, the transfer rule candidate set Stc and the thresholds τ1 and τ2, and the output is the transfer rule correct candidate set Strc. In the algorithm, the number of simultaneous occurrences of the first two entity relations of each transfer rule candidate is first obtained with Function1(). For example, for the specific transfer rule candidate tc_1 = (deskmate, same class, classmate)_c, a count n is first initialized to 0; if two entity-entity triples (e_1, deskmate, e_2)_e and (e_2, same class, e_3)_e occur simultaneously in the knowledge graph of the school social network, the count n is incremented by 1, and this step is repeated until all cases in which the first two entity relations of the transfer rule candidate occur together have been found and the count n updated; finally Function1() returns the count n. Secondly, the ratio of the number of occurrences of the transfer rule candidate itself to the number of simultaneous occurrences of its first two entity relations is taken as the score of the transfer rule candidate, and this step is repeated to calculate the score of every transfer rule candidate. Finally, each transfer rule candidate is filtered with the thresholds τ1 and τ2. First, the number of occurrences of each transfer rule candidate is compared with the threshold τ1, and if the former is larger, the subsequent threshold comparison continues. Secondly, the score of the transfer rule candidate is compared with the threshold τ2, and if the former is larger, the transfer rule candidate is added to the transfer rule correct candidate set Strc. The comparison with the threshold τ1 is intended to confirm that the number of occurrences of a transfer rule candidate satisfies a certain condition, and the comparison with the threshold τ2 is intended to confirm the correctness of the transfer rule candidate. In this way, each transfer rule candidate is filtered with the transfer rule candidate filtering algorithm to obtain the transfer rule correct candidate set Strc.
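A sketch of this filtering step, assuming entity-entity triples are plain (head, relation, tail) tuples and counting a candidate's occurrences directly from the graph; Function1() and the thresholds τ1 and τ2 follow the description, everything else is an illustrative assumption.

def function1(entity_triples, re_l, re_m):
    """Count how often the first two relations of a candidate co-occur:
    (e1, re_l, e2) and (e2, re_m, e3) both present in the graph."""
    by_rel = {}
    for h, r, t in entity_triples:
        by_rel.setdefault(r, []).append((h, t))
    n = 0
    for e1, e2 in by_rel.get(re_l, []):
        n += sum(1 for h, t in by_rel.get(re_m, []) if h == e2)
    return n

def filter_transfer_candidates(entity_triples, candidates, tau1, tau2):
    """Keep a candidate (re_l, re_m, re_n) if its occurrence count exceeds
    tau1 and its score (occurrences / premise co-occurrences) exceeds tau2."""
    existing = set(entity_triples)
    correct = set()
    for re_l, re_m, re_n in candidates:
        premise = function1(existing, re_l, re_m)
        # occurrences: premise co-occurrences whose conclusion (e1, re_n, e3) also holds
        occurrences = sum(
            1
            for h1, r1, t1 in existing if r1 == re_l
            for h2, r2, t2 in existing
            if r2 == re_m and h2 == t1 and (h1, re_n, t2) in existing)
        if premise > 0 and occurrences > tau1 and occurrences / premise > tau2:
            correct.add((re_l, re_m, re_n))
    return correct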
In this embodiment, the thresholds may be set according to the actual situation.
Specifically, in this embodiment, after the specific transfer rule candidates tc_1 and tc_2 are filtered by the algorithm, the transfer rule correct candidate trc_1 = (deskmate, same class, classmate)_r is obtained; tc_2 is screened out because it is incorrect.
Fourthly, a transfer-rule completion entity-entity triple tet is obtained from two known entity-entity triples in the knowledge graph of the school social network and the corresponding transfer rule correct candidate trc. Specifically, in this embodiment, from the two known entity-entity triples (Zhang San, deskmate, Wang Wu)_e and (Wang Wu, same class, Xiao Hong)_e in the knowledge graph of the school social network and the transfer rule correct candidate trc_1 = (deskmate, same class, classmate)_r, the new entity-entity triple (Zhang San, classmate, Xiao Hong)_e is obtained. All transfer-rule completion entity-entity triples form the transfer-rule completion entity-entity triple set Stet = {tet_1, tet_2, ..., tet_|Stet|}.
Finally, all entity-entity triples in the transfer-rule completion entity-entity triple set Stet are added to the knowledge graph of the school social network, completing the knowledge graph completion of the school social network based on the transfer rule.
S22, completing the entity-entity triple set based on the antisymmetric rule. First, the antisymmetric rule is defined as

(e_i, re_l, e_j)_e → (e_j, re_m, e_i)_e

where e_i, e_j ∈ E represent entities and re_l, re_m ∈ Re represent entity relations, and all specific antisymmetric rules conforming to the antisymmetric rule form are extracted from the knowledge graph of the school social network. Specifically, in this embodiment the specific antisymmetric rules extracted from the knowledge graph of the school social network may be

(e_1, teacher, e_2)_e → (e_2, student, e_1)_e
(e_3, friend, e_4)_e → (e_4, classmate, e_3)_e

Secondly, the key information of each specific antisymmetric rule is extracted to form an antisymmetric rule candidate ac = (re_l, re_m)_c, where re_l, re_m ∈ Re represent entity relations. Specifically, in this embodiment the antisymmetric rule candidates formed by extracting the key information of the above specific antisymmetric rules are ac_1 = (teacher, student)_c and ac_2 = (friend, classmate)_c. All antisymmetric rule candidates constitute the antisymmetric rule candidate set Sac.
Thirdly, for any antisymmetric rule candidate, its correctness is also undetermined. For ac_1 = (teacher, student)_c, if e_1's teacher is e_2, it obviously follows that e_2's student is e_1, which expresses the antisymmetry between the teacher relation and the student relation, so ac_1 is a correct antisymmetric rule candidate. But for ac_2 = (friend, classmate)_c, if e_3's friend is e_4, e_4's classmate is not necessarily e_3, so ac_2 is an incorrect antisymmetric rule candidate. Each antisymmetric rule candidate therefore needs to be filtered to determine its correctness. Each antisymmetric rule candidate ac is filtered with the antisymmetric rule candidate filtering algorithm to obtain the antisymmetric rule correct candidates arc = (re_l, re_m)_r, where re_l, re_m ∈ Re represent entity relations. All antisymmetric rule correct candidates constitute the antisymmetric rule correct candidate set Sarc.
The input of the antisymmetric rule candidate filtering algorithm comprises the entity-entity triple set Te, the antisymmetric rule candidate set Sac and the thresholds τ3 and τ4, and the output is the antisymmetric rule correct candidate set Sarc. The algorithm first obtains, with Function2(), the number of occurrences in the knowledge graph of each of the two entity relations in each antisymmetric rule candidate. For example, for the specific antisymmetric rule candidate ac_1 = (teacher, student)_c, two counts n_1 and n_2 are first set, representing the numbers of occurrences in the knowledge graph of the two entity relations in the antisymmetric rule candidate, and n_1 and n_2 are initialized to 0. If an entity-entity triple (e_1, teacher, e_2)_e appears in the knowledge graph, the count n_1 is incremented by 1; if an entity-entity triple (e_3, student, e_4)_e appears in the knowledge graph, the count n_2 is incremented by 1; this step is repeated until all occurrences of the two entity relations in the antisymmetric rule candidate have been found and the counts n_1 and n_2 updated, and finally Function2() returns the counts n_1 and n_2. Secondly, the ratios of the number of occurrences of the antisymmetric rule candidate to the numbers of occurrences of its two entity relations are calculated and taken as the two scores of the antisymmetric rule candidate, and this step is repeated to calculate the two scores of every antisymmetric rule candidate. Finally, each antisymmetric rule candidate is filtered with the thresholds τ3 and τ4. First, the number of occurrences of each antisymmetric rule candidate is compared with the threshold τ3, and if the former is larger, the subsequent threshold comparison continues. Secondly, the two scores of each antisymmetric rule candidate are compared with the threshold τ4; if both scores are greater than the threshold τ4, the antisymmetric rule candidate is added to the antisymmetric rule correct candidate set Sarc. The comparison with the threshold τ3 is intended to confirm that the number of occurrences of an antisymmetric rule candidate satisfies a certain condition, and the comparison with the threshold τ4 is intended to confirm the correctness of the antisymmetric rule candidate. In this way, each antisymmetric rule candidate is filtered with the antisymmetric rule candidate filtering algorithm to obtain the antisymmetric rule correct candidate set Sarc.
Specifically, in this embodiment, after the antisymmetric rule candidates ac_1 and ac_2 are filtered, the antisymmetric rule correct candidate arc_1 = (teacher, student)_r is obtained; ac_2 is screened out because it is incorrect.
Fourthly, an antisymmetric-rule completion entity-entity triple tea is obtained from a known entity-entity triple in the knowledge graph of the school social network and the corresponding antisymmetric rule correct candidate arc. Specifically, in this embodiment, from the known entity-entity triple (Zhang San, teacher, Sun Jie)_e in the knowledge graph of the school social network and the antisymmetric rule correct candidate arc_1 = (teacher, student)_r, the new entity-entity triple (Sun Jie, student, Zhang San)_e is obtained. All antisymmetric-rule completion entity-entity triples form the antisymmetric-rule completion entity-entity triple set Stea = {tea_1, tea_2, ..., tea_|Stea|}.
Finally, all entity-entity triples in the antisymmetric-rule completion entity-entity triple set Stea are added to the knowledge graph of the school social network, completing the knowledge graph completion based on the antisymmetric rule.
S23, completing the entity-entity triple set based on the entity association rule. First, the entity association rule is defined as

(e_i, re_l, e_j)_e → (e_i, re_m, e_k)_e

where e_i, e_j, e_k ∈ E ∧ e_j ≠ e_k represent entities and re_l, re_m ∈ Re represent entity relations, and all specific entity association rules conforming to the entity association rule form are extracted from the knowledge graph of the school social network. Specifically, in this embodiment the specific entity association rules extracted from the knowledge graph of the school social network may be

(e_1, teacher, Sun Jie)_e → (e_1, classmate, Zhao Liu)_e
(e_2, friend, Li Si)_e → (e_2, friend, Zheng Qi)_e

Secondly, the key information of each specific entity association rule is extracted to form an entity association rule candidate rc = (re_l, e_j, re_m, e_k)_cr, where e_j, e_k ∈ E represent entities and re_l, re_m ∈ Re represent entity relations. Specifically, in this embodiment the entity association rule candidates formed by extracting the key information are rc_1 = (teacher, Sun Jie, classmate, Zhao Liu)_cr and rc_2 = (friend, Li Si, friend, Zheng Qi)_cr. All entity association rule candidates constitute the entity association rule candidate set Src.
Thirdly, for rc_1 = (teacher, Sun Jie, classmate, Zhao Liu)_cr, if e_1's teacher is Sun Jie, it follows that e_1 is a classmate of Zhao Liu, because Zhao Liu's teacher is Sun Jie, so rc_1 is a correct entity association rule candidate. But if the association direction of rc_1 is reversed, i.e. rc_1' = (classmate, Zhao Liu, teacher, Sun Jie)_cr, it would follow that if e_2 is a classmate of Zhao Liu then e_2's teacher is Sun Jie, which is clearly incorrect, because if e_2 is a classmate of Zhao Liu, e_2's teacher is not necessarily Sun Jie but may be another teacher. For rc_2 = (friend, Li Si, friend, Zheng Qi)_cr, the association is erroneous in either direction. Therefore, for each entity association rule candidate, not only its correctness but also its association direction needs to be confirmed. Each entity association rule candidate rc is filtered with the entity association rule candidate filtering algorithm to obtain the entity association rule correct candidates rrc = (re_l, e_j, re_m, e_k)_rr, where e_j, e_k ∈ E represent entities and re_l, re_m ∈ Re represent entity relations. All entity association rule correct candidates constitute the entity association rule correct candidate set Srrc.
The input of the entity association rule candidate filtering algorithm comprises the entity-entity triple set Te, the entity association rule candidate set Src and the thresholds τ5 and τ6, and the output is the entity association rule correct candidate set Srrc. In the algorithm, Function3() is first used to obtain, for each entity association rule candidate, the number of times each entity relation occurs with its corresponding entity and the number of times the two entity relation-entity combinations occur simultaneously. For example, for the entity association rule candidate rc_1 = (teacher, Sun Jie, classmate, Zhao Liu)_cr, three counts n_1, n_2 and n_3 are first set, representing respectively the number of occurrences of the first entity relation with its corresponding entity, the number of occurrences of the second entity relation with its corresponding entity, and the number of simultaneous occurrences of the two entity relation-entity combinations in the specific entity association rule candidate, and n_1, n_2 and n_3 are initialized to 0. If an entity-entity triple (e_1, teacher, Sun Jie)_e appears in the knowledge graph, the count n_1 is incremented by 1; if an entity-entity triple (e_2, classmate, Zhao Liu)_e appears in the knowledge graph, the count n_2 is incremented by 1; if the two entity-entity triples (e_3, teacher, Sun Jie)_e and (e_3, classmate, Zhao Liu)_e appear simultaneously, the count n_3 is incremented by 1. This step is repeated until the numbers of occurrences of the two entity relation-entity combinations, separately and simultaneously, have been found and the counts n_1, n_2 and n_3 updated; finally Function3() returns the counts n_1, n_2 and n_3.
Secondly, the number of simultaneous occurrences of an entity association rule candidate is n_3, the number of occurrences of the first entity relation with its corresponding entity is n_1, and the number of occurrences of the second entity relation with its corresponding entity is n_2; the ratios n_3/n_1 and n_3/n_2 are calculated.
These two ratios are taken as the two scores of the entity association rule candidate, and this step is repeated to calculate the two scores of every entity association rule candidate. Finally, each entity association rule candidate is filtered with the thresholds τ5 and τ6. First, the number of occurrences of each entity association rule candidate is compared with the threshold τ5, and if the former is larger, the subsequent threshold comparison continues.
Secondly, the two scores of the entity association rule candidate are compared with the threshold τ6; if both scores are greater than the threshold τ6, the subsequent comparison continues.
Then the two scores of the entity association rule candidate are compared with each other: if the first score is smaller than the second score, the association direction of the entity association rule candidate is exchanged with the Swap() function; otherwise it is not exchanged. For example, for an entity association rule candidate of the form rc_1 = (teacher, Sun Jie, classmate, Zhao Liu)_cr, the Swap() function would exchange its association direction, turning rc_1 into rc_1' = (classmate, Zhao Liu, teacher, Sun Jie)_cr. Finally, the entity association rule candidate is added to the entity association rule correct candidate set Srrc. The comparison with the threshold τ5 is intended to confirm that the number of occurrences of an entity association rule candidate satisfies a certain condition, the comparison with the threshold τ6 is intended to confirm the correctness of the entity association rule candidate, and the comparison of the two scores of the entity association rule candidate is intended to determine its association direction. In this way, the entity association rule candidates are filtered with the entity association rule filtering algorithm to obtain the entity association rule correct candidate set Srrc.
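A sketch of entity association rule filtering with direction handling; the counts n_1, n_2, n_3, the two scores and the Swap() behaviour follow the description above, while the data layout and the handling of zero counts are assumptions.

def function3(entity_triples, re_l, e_j, re_m, e_k):
    """Count occurrences of (?, re_l, e_j), of (?, re_m, e_k), and of both
    for the same head entity (n_1, n_2, n_3)."""
    heads_1 = {h for h, r, t in entity_triples if r == re_l and t == e_j}
    heads_2 = {h for h, r, t in entity_triples if r == re_m and t == e_k}
    return len(heads_1), len(heads_2), len(heads_1 & heads_2)

def filter_entity_association_candidates(entity_triples, candidates, tau5, tau6):
    """Keep candidates whose simultaneous count n_3 exceeds tau5 and whose two
    scores n_3/n_1 and n_3/n_2 both exceed tau6; swap the association
    direction when the first score is smaller than the second."""
    correct = set()
    for re_l, e_j, re_m, e_k in candidates:
        n1, n2, n3 = function3(entity_triples, re_l, e_j, re_m, e_k)
        if n1 == 0 or n2 == 0 or n3 <= tau5:
            continue
        s1, s2 = n3 / n1, n3 / n2
        if s1 > tau6 and s2 > tau6:
            if s1 < s2:  # Swap(): reverse the association direction
                re_l, e_j, re_m, e_k = re_m, e_k, re_l, e_j
            correct.add((re_l, e_j, re_m, e_k))
    return correct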
Specifically, in this embodiment, after the entity association rule candidates rc_1 and rc_2 are filtered, the entity association rule correct candidate rrc_1 = (teacher, Sun Jie, classmate, Zhao Liu)_rr is obtained; rc_2 is screened out because it is incorrect.
Fourthly, an entity-association-rule completion entity-entity triple is obtained from a known entity-entity triple in the knowledge graph and the corresponding entity association rule correct candidate rrc. Specifically, in this embodiment, from the known entity-entity triple (Zhang San, teacher, Sun Jie)_e in the knowledge graph and the entity association rule correct candidate rrc_1 = (teacher, Sun Jie, classmate, Zhao Liu)_rr, the new entity-entity triple (Zhang San, classmate, Zhao Liu)_e is obtained. All entity-association-rule completion entity-entity triples form the entity-association-rule completion entity-entity triple set Ster = {ter_1, ter_2, ..., ter_|Ster|}.
Finally, all entity-entity triples in the entity-association-rule completion entity-entity triple set Ster are added to the knowledge graph, completing the knowledge graph completion based on the entity association rule.
S24, completing the entity-attribute triple set based on the attribute association rule. First, the attribute association rule is defined as

(e_i, ra_l, a_j)_a → (e_i, ra_m, a_k)_a

where e_i ∈ E represents an entity, a_j, a_k ∈ A ∧ a_j ≠ a_k represent attributes, and ra_l, ra_m ∈ Ra represent attribute relations, and all specific attribute association rules conforming to the attribute association rule form are extracted from the knowledge graph of the school social network. Specifically, in this embodiment the specific attribute association rules extracted from the knowledge graph of the school social network may be

(e_1, punishment, fighting)_a → (e_1, grades, poor)_a
(e_2, grades, good)_a → (e_2, punishment, none)_a

Secondly, the key information of each specific attribute association rule is extracted to form an attribute association rule candidate atrc = (ra_l, a_j, ra_m, a_k)_car, where a_j, a_k ∈ A ∧ a_j ≠ a_k represent attributes and ra_l, ra_m ∈ Ra represent attribute relations. Specifically, in this embodiment the attribute association rule candidates formed by extracting the key information of the above specific attribute association rules are atrc_1 = (punishment, fighting, grades, poor)_car and atrc_2 = (grades, good, punishment, none)_car. All attribute association rule candidates constitute the attribute association rule candidate set Satrc.
Thirdly, for atrc_1 = (punishment, fighting, grades, poor)_car, if e_1 has been punished for fighting, e_1's grades must be poor, so atrc_1 is a correct attribute association rule candidate. But if the association direction of atrc_1 is reversed, i.e. atrc_1' = (grades, poor, punishment, fighting)_car, it would follow that if e_2's grades are poor then e_2 has been punished for fighting, which does not necessarily hold: e_2 may have received no punishment, or a punishment for something else. For atrc_2 = (grades, good, punishment, none)_car, the association is erroneous in either direction. Therefore, for any attribute association rule candidate, not only its correctness but also its association direction needs to be confirmed. Each attribute association rule candidate atrc is filtered with the attribute association rule candidate filtering algorithm to obtain the attribute association rule correct candidates arrc = (ra_l, a_j, ra_m, a_k)_aar, where a_j, a_k ∈ A ∧ a_j ≠ a_k represent attributes and ra_l, ra_m ∈ Ra represent attribute relations.
The input of the attribute association rule candidate filtering algorithm comprises an entity-attribute ternary set Ta, an attribute association rule candidate set Satrc and a threshold tau7And τ8The output of the algorithm is the correct candidate set of attribute association rules, Sarrc. In the algorithm, firstly, the Function4() is used to obtain the number of the two attribute relations in each attribute association rule candidate appearing with the combination of the attribute relations and the attribute relations appearing with the combination of the attribute relations. Such as associating a rule candidate, atrc, for a particular attribute1Either because of the difference in score or rankcarFirst, three counts n are set1、n2、n3Respectively representing the number of the combinations of two attribute relations and the respective attributes in the attribute association rule candidates and the number of the combinations of two attribute relations and the respective attributes appearing simultaneously, and dividing n1、n2、n3The initial setting is 0. If an entity-attribute triple (e) appears in the knowledge-graph1Punishment and support)aThen n will be counted1Plus 1, if an entity-attribute triple (e) appears in the knowledge-graph2Achievement, difference)aThen n will be counted2Plus 1, if two entity-attribute triplets (e) occur simultaneously3Punishment and support)aAnd (e)3Achievement, difference)aThen n will be counted3And adding 1. Repeating the steps until the number of the two attribute relations in the attribute association rule candidate and the combination of the two attribute relations and the respective attributes are foundThe number of epochs, and update the count n1、n2、n3Finally Function4() returns the count n1、n2、n3. Secondly, calculating the ratio of the number of the occurrences of each attribute association rule candidate to the number of the occurrences of the two attribute relations in the candidate and the respective attribute combination, taking the two ratios as two scores of the attribute association rule candidate, and repeating the step to calculate the two scores of each attribute association rule candidate. Finally, using a threshold τ7And threshold τ8And filtering each attribute association rule candidate. First, the number of occurrences of a rule candidate is associated with a threshold τ using an attribute7And comparing, and continuing the subsequent threshold comparison if the previous value is larger. Secondly, the ratio of the number of the candidate occurrences of the attribute association rule to the number of the two attribute relations and the respective attribute combination of the candidate occurrences at the same time is utilized to be equal to the threshold value tau8And comparing, and continuing to perform subsequent comparison if the previous value is larger.
Then the two scores of the attribute association rule candidate are compared; if the first score is smaller than the second score, the association direction of the candidate is exchanged by the Swap() function, otherwise no exchange is performed. For example, exchanging the attribute association rule candidate atrc1 = (punishment, fighting, achievement, poor)car would give atrc1' = (achievement, poor, punishment, fighting)car. Finally, the attribute association rule candidate is added to the attribute association rule correct candidate set Sarrc. The purpose of the comparison with the threshold τ7 is to confirm that the number of occurrences of a specific attribute association rule candidate satisfies a certain condition, the purpose of the comparison with the threshold τ8 is to confirm the correctness of a specific attribute association rule candidate, and the purpose of comparing the two scores of the candidate is to determine its association direction. In this way, each attribute association rule candidate is filtered by the attribute association rule candidate filtering algorithm, and the attribute association rule correct candidate set Sarrc is obtained.
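As an illustration only, the following Python sketch shows one possible reading of the filtering procedure described above. The helper name count_combinations (standing in for Function4()), the representation of entity-attribute triples as (entity, attribute relationship, attribute) tuples, and the exact way the two scores are compared with the thresholds τ7 and τ8 are assumptions made for this sketch, not the literal algorithm of the embodiment.

def count_combinations(ta, atrc):
    """Count n1, n2, n3 for one candidate atrc = (ra_l, a_j, ra_m, a_k).

    ta: iterable of entity-attribute triples (entity, attribute_relation, attribute).
    n1: entities having (ra_l, a_j); n2: entities having (ra_m, a_k);
    n3: entities having both combinations at the same time.
    """
    ra_l, a_j, ra_m, a_k = atrc
    has_first = {e for (e, ra, a) in ta if ra == ra_l and a == a_j}
    has_second = {e for (e, ra, a) in ta if ra == ra_m and a == a_k}
    return len(has_first), len(has_second), len(has_first & has_second)


def filter_attribute_rule_candidates(ta, satrc, tau7, tau8):
    """Filter candidates into correct candidates, fixing the association direction."""
    sarrc = []
    for atrc in satrc:
        n1, n2, n3 = count_combinations(ta, atrc)
        if n1 == 0 or n2 == 0:
            continue
        score1, score2 = n3 / n1, n3 / n2        # the two scores of the candidate
        if n3 <= tau7:                           # enough simultaneous occurrences?
            continue
        if max(score1, score2) <= tau8:          # is the association strong enough?
            continue
        ra_l, a_j, ra_m, a_k = atrc
        if score1 < score2:                      # Swap(): reverse the association direction
            atrc = (ra_m, a_k, ra_l, a_j)
        sarrc.append(atrc)
    return sarrc

With a triplet set in which, say, several students punished for fighting all have poor achievement, the candidate (punishment, fighting, achievement, poor) would pass both thresholds and keep its direction, while a candidate whose two combinations rarely co-occur would be discarded.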
Specifically, in the present embodiment, filtering the attribute association rule candidates atrc1 and atrc2 yields the attribute association rule correct candidate arrc1 = (punishment, fighting, achievement, poor)aar, while atrc2 is screened out because it is incorrect. All attribute association rule correct candidates constitute the attribute association rule correct candidate set Sarrc.
And fourthly, the known entity-attribute triples in the knowledge graph and the corresponding attribute association rule correct candidates arrc are used to obtain the attribute association rule completion entity-attribute triples tar. Specifically, in this embodiment, the entity-attribute triple (Zhang San, punishment, fighting)a known in the knowledge graph and the attribute association rule correct candidate arrc1 = (punishment, fighting, achievement, poor)aar yield the new entity-attribute triple (Zhang San, achievement, poor)a. All attribute association rule completion entity-attribute triples form the attribute association rule completion entity-attribute triplet set Star = {tar1, tar2, ..., tar|Star|}.
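The application of the correct candidates to the known triples can likewise be sketched in a few lines. The function name apply_attribute_rules and the tuple layout are again assumptions for illustration, not part of the patented method itself.

def apply_attribute_rules(ta, sarrc):
    """Apply each correct candidate (ra_l, a_j) => (ra_m, a_k) to the known
    entity-attribute triples and return the newly inferred triples."""
    existing = set(ta)
    star = set()
    for (ra_l, a_j, ra_m, a_k) in sarrc:
        for (e, ra, a) in ta:
            if ra == ra_l and a == a_j:
                new_triple = (e, ra_m, a_k)
                if new_triple not in existing:
                    star.add(new_triple)
    return star


# For example, ta = {("Zhang San", "punishment", "fighting")} together with
# sarrc = [("punishment", "fighting", "achievement", "poor")] yields
# {("Zhang San", "achievement", "poor")}.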
And finally, all entity-attribute triples in the attribute association rule completion entity-attribute triplet set Star are added to the knowledge graph, which completes the knowledge graph completion based on the attribute association rules.
After the rule-based knowledge graph completion shown in fig. 1 is finished, the school social network knowledge graph completed for the first time is obtained; in fig. 1, the completed entity-entity triples are represented by thick solid lines and the completed entity-attribute triples are represented by thick dotted lines. From the existing entity-entity triples and entity-attribute triples together with the completed ones, it can be analysed, for example, that Li Si is a friend of Zhang San, that Zhang San and Zhao Liu are classmates, and that Zhang San has been punished for fighting; through the relationship between Zhang San and Zhao Liu and the attributes of Zhang San, the possibility that Zhang San bullies Zhao Liu can be inferred, and on the basis of this inference it can be investigated whether such bullying behaviour actually exists.
And S3, as shown in fig. 2, based on the knowledge graph completed for the first time, the relation triples defined in the knowledge graph are first initialized into embedding vectors that are easy for a machine to process, model training is then performed on the embedding vectors, and finally the trained embedding vectors are used to carry out the embedding-based knowledge graph completion task; the predicted new triples are used to complete the knowledge graph for the second time. The specific process is as follows:
and S31, the entities, attributes, entity relationships and attribute relationships in the knowledge graph are converted into embedding vectors, and a scoring function and a loss function on the entity-entity triples and on the entity-attribute triples are defined based on the TransE model. Specifically, the entity embedding vector set is defined as E = {e1, e2, ..., e|E|}, where e1, e2, ..., e|E| represent the entity embedding vectors corresponding to the entities e1, e2, ..., e|E|; the attribute embedding vector set is defined as A = {a1, a2, ..., a|A|}, where a1, a2, ..., a|A| represent the attribute embedding vectors corresponding to the attributes a1, a2, ..., a|A|; the entity relationship embedding vector set is defined as Re = {re1, re2, ..., re|Re|}, where re1, re2, ..., re|Re| represent the entity relationship embedding vectors corresponding to the entity relationships re1, re2, ..., re|Re|; and the attribute relationship embedding vector set is defined as Ra = {ra1, ra2, ..., ra|Ra|}, where ra1, ra2, ..., ra|Ra| represent the attribute relationship embedding vectors corresponding to the attribute relationships ra1, ra2, ..., ra|Ra|.
In the embodiment, two types of triples, namely, an entity-entity triple and an entity-attribute triple, are defined, and therefore, a scoring function for respectively judging the correctness of the two types of triples is required. The scoring function on an entity-entity triple is defined as
f_e(e_i, re_l, e_j) = \| e_i + re_l - e_j \|_{L_1/L_2}
wherein e_i, e_j ∈ E represent entity embedding vectors, re_l ∈ Re represents an entity relationship embedding vector, and L1|L2 represents the 1-norm or the 2-norm. In the formula, the entity relationship embedding vector re_l is added to the entity embedding vector e_i, the entity embedding vector e_j is subtracted, and the 1-norm or 2-norm of the result is calculated to measure the correctness of the entity-entity triple. The smaller the value calculated by the scoring function, the closer e_i + re_l is to e_j and the more likely the entity-entity triple is to exist, and vice versa. The scoring function on an entity-attribute triple is defined as
f_a(e_i, ra_l, a_j) = \| e_i + ra_l - a_j \|_{L_1/L_2}
wherein e_i ∈ E represents an entity embedding vector, a_j ∈ A represents an attribute embedding vector, ra_l ∈ Ra represents an attribute relationship embedding vector, and L1|L2 represents the 1-norm or the 2-norm. In the formula, the attribute relationship embedding vector ra_l is added to the entity embedding vector e_i, the attribute embedding vector a_j is subtracted, and the 1-norm or 2-norm of the result is calculated to measure the correctness of the entity-attribute triple; the smaller this value, the more likely the entity-attribute triple is to exist, and vice versa.
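The two scoring functions can be written down directly; the following is a minimal sketch using NumPy, with the embedding vectors represented as one-dimensional arrays and the norm selectable between the 1-norm and the 2-norm. The function names are illustrative only.

import numpy as np

def score_entity_triple(e_i, re_l, e_j, norm=1):
    """Score of an entity-entity triple: ||e_i + re_l - e_j|| (1-norm or 2-norm)."""
    return np.linalg.norm(e_i + re_l - e_j, ord=norm)

def score_attribute_triple(e_i, ra_l, a_j, norm=1):
    """Score of an entity-attribute triple: ||e_i + ra_l - a_j|| (1-norm or 2-norm)."""
    return np.linalg.norm(e_i + ra_l - a_j, ord=norm)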
The TransE model trains the embedding vectors by minimizing a margin-based loss. The loss function on the entity-entity triples is defined as
L_e = \sum_{(e_i, re_l, e_j)_e \in Te} \sum_{(e_i', re_l, e_j')_e \in Te'} [ f_e(e_i, re_l, e_j) - f_e(e_i', re_l, e_j') + \gamma ]_+
wherein (e_i, re_l, e_j)_e ∈ Te represents an entity-entity triple, (e_i', re_l, e_j')_e ∈ Te' represents the entity-entity triple obtained by randomly replacing e_i or e_j in (e_i, re_l, e_j)_e, γ is a margin hyper-parameter, and [X]_+ means that if X is a number greater than or equal to 0 the result is that number, otherwise the result is 0. The objective of minimizing the loss function is to make the scores of correct entity-entity triples as small as possible and the scores of incorrect entity-entity triples as large as possible, so that the correctness of an entity-entity triple can be measured correctly. The loss function on an entity-attribute triple is defined as
L_a = \sum_{(e_i, ra_l, a_j)_a \in Ta} \sum_{(e_i', ra_l, a_j')_a \in Ta'} [ f_a(e_i, ra_l, a_j) - f_a(e_i', ra_l, a_j') + \gamma ]_+
wherein (e_i, ra_l, a_j)_a ∈ Ta represents an entity-attribute triple, (e_i', ra_l, a_j')_a ∈ Ta' represents the entity-attribute triple obtained by randomly replacing e_i or a_j in (e_i, ra_l, a_j)_a, γ is a margin hyper-parameter, and [X]_+ means that if X is a number greater than or equal to 0 the result is that number, otherwise the result is 0. The objective of minimizing the loss function is to make the scores of correct entity-attribute triples as small as possible and the scores of incorrect entity-attribute triples as large as possible, so that the correctness of an entity-attribute triple can be measured correctly.
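For a single positive/negative pair, the bracketed term of both loss functions reduces to a simple hinge; a sketch of this term and of a batch loss is shown below. The function names are illustrative assumptions, not part of the embodiment.

def margin_term(pos_score, neg_score, gamma):
    """One term of the margin-based loss: [f(positive) - f(negative) + gamma]_+."""
    return max(pos_score - neg_score + gamma, 0.0)

def batch_loss(score_pairs, gamma):
    """Sum of the margin terms over a batch of (positive_score, negative_score) pairs."""
    return sum(margin_term(p, n, gamma) for p, n in score_pairs)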
And S32, based on the defined scoring function and loss function, embedding model training is performed on the entity-entity triples by using the TransE model training algorithm to obtain the trained embedding vectors.
Specifically, the TransE model training algorithm on the entity-entity triples first randomly initializes the entity relationship embedding vectors to vectors of dimension k using a uniform distribution and then normalizes each entity relationship embedding vector. Secondly, the entity embedding vectors are randomly initialized to vectors of dimension k using a uniform distribution. Thirdly, in each training round, each entity embedding vector is normalized, b entity-entity triples are randomly sampled from the entity-entity triplet set, and negative sampling is performed on the sampled entity-entity triples, that is, the head entity or the tail entity of an entity-entity triple is randomly replaced with another entity to form a negative-sampling entity-entity triple; each entity-entity triple and its corresponding negative-sampling entity-entity triple are then added as a combination to the set T. Finally, the loss is calculated for each combination in the set T and the parameters are updated.
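A compact sketch of such a training loop on the entity-entity triples is given below, assuming the 1-norm scoring function, triples given as a list of (head, relation, tail) identifiers, and uniform negative sampling. The function name train_transe and the default hyper-parameters are illustrative assumptions, not the exact settings of the embodiment.

import random
import numpy as np

def train_transe(triples, entities, relations, k=50, b=128, gamma=1.0, alpha=0.01, epochs=100):
    """Minimal TransE training loop on entity-entity triples (uniform negative sampling)."""
    bound = 6.0 / np.sqrt(k)
    ent = {e: np.random.uniform(-bound, bound, k) for e in entities}
    rel = {r: np.random.uniform(-bound, bound, k) for r in relations}
    for r in rel:                                   # normalize relation embeddings once
        rel[r] /= np.linalg.norm(rel[r])
    for _ in range(epochs):
        for e in ent:                               # normalize entity embeddings each round
            ent[e] /= np.linalg.norm(ent[e])
        batch = random.sample(triples, min(b, len(triples)))
        T = []
        for (h, r, t) in batch:                     # negative sampling: corrupt head or tail
            if random.random() < 0.5:
                T.append(((h, r, t), (random.choice(entities), r, t)))
            else:
                T.append(((h, r, t), (h, r, random.choice(entities))))
        for (h, r, t), (h2, r2, t2) in T:           # one gradient step per combination
            pos = ent[h] + rel[r] - ent[t]
            neg = ent[h2] + rel[r2] - ent[t2]
            if np.sum(np.abs(pos)) - np.sum(np.abs(neg)) + gamma > 0:
                gpos, gneg = np.sign(pos), np.sign(neg)   # subgradient of the 1-norm
                ent[h] -= alpha * gpos
                ent[t] += alpha * gpos
                rel[r] -= alpha * (gpos - gneg)
                ent[h2] += alpha * gneg
                ent[t2] -= alpha * gneg
    return ent, rel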
The parameters are updated by gradient descent on the loss; for example, for the entity-entity triple (e1, re1, e2)_e and its corresponding negative-sampling entity-entity triple (e1', re1, e2')_e, the gradient descent is expressed as
\theta_i \leftarrow \theta_i - \alpha \frac{\partial J(\theta)}{\partial \theta_i}
wherein θ_i denotes a parameter, α denotes the learning rate, and J() denotes the loss function. According to this gradient descent formula, the gradient descent updates of TransE are given by the following five formulas:
e_1 \leftarrow e_1 - \alpha \frac{\partial L_e(e_1, re_1, e_2, e_1', e_2')}{\partial e_1}    (1)

re_1 \leftarrow re_1 - \alpha \frac{\partial L_e(e_1, re_1, e_2, e_1', e_2')}{\partial re_1}    (2)

e_2 \leftarrow e_2 - \alpha \frac{\partial L_e(e_1, re_1, e_2, e_1', e_2')}{\partial e_2}    (3)

e_1' \leftarrow e_1' - \alpha \frac{\partial L_e(e_1, re_1, e_2, e_1', e_2')}{\partial e_1'}    (4)

e_2' \leftarrow e_2' - \alpha \frac{\partial L_e(e_1, re_1, e_2, e_1', e_2')}{\partial e_2'}    (5)
Taking formula (1) as an example, dropping the Σ summations simplifies L_e(e_1, re_1, e_2, e_1', e_2') to the following equation (6):
L_e(e_1, re_1, e_2, e_1', e_2') = [ f_e(e_1, re_1, e_2) - f_e(e_1', re_1, e_2') + \gamma ]_+    (6)
combining equation (1) and equation (6) yields the following equation (7):
e_1 \leftarrow e_1 - \alpha \frac{\partial [ f_e(e_1, re_1, e_2) - f_e(e_1', re_1, e_2') + \gamma ]_+}{\partial e_1}    (7)
wherein, if f_e(e_1, re_1, e_2) - f_e(e_1', re_1, e_2') + γ is less than or equal to 0, the parameter is not updated. If the calculated value is greater than 0, the partial derivative with respect to e_1 is calculated; when the 1-norm is used, for example, this ultimately gives a sign vector of the form [1, 1, 1, -1, ...], and the parameter is finally updated using formula (1). In formulas (1) to (5), α represents the learning rate; the purpose of multiplying by the learning rate is to prevent zigzag oscillation, which would lead to non-convergence. Each parameter is updated in every round in such a way that the value calculated by the scoring function for a correct entity-entity triple becomes as small as possible and the value calculated by the scoring function for a wrong entity-entity triple becomes as large as possible, so that the overall loss tends to 0. The update processes of the other parameters are similar to that of e_1.
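For the 1-norm, the partial derivative mentioned above can be written explicitly; the following is one way to state the sign-vector update, assuming the bracketed term of formula (7) is positive and using the convention sign(0) = 0:

\frac{\partial}{\partial e_1} \| e_1 + re_1 - e_2 \|_{L_1} = \operatorname{sign}(e_1 + re_1 - e_2), \qquad e_1 \leftarrow e_1 - \alpha \, \operatorname{sign}(e_1 + re_1 - e_2)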
And S33, based on the defined scoring function and loss function, embedding model training is performed on the entity-attribute triples by using the TransE model training algorithm to obtain the trained embedding vectors; the implementation is the same as in S32. After the whole model training is finished, the entity embedding vector set, the entity relationship embedding vector set, the attribute embedding vector set and the attribute relationship embedding vector set are obtained.
And S34, based on the trained entity embedding vector set, attribute embedding vector set, entity relationship embedding vector set and attribute relationship embedding vector set, entity prediction, relationship prediction and attribute prediction are performed on the knowledge graph to carry out the second completion. The specific process is as follows:
and S341, the knowledge graph is completed by using an entity prediction algorithm. Specifically, the input of the entity prediction algorithm includes flag, the head entity embedding vector e_i or the tail entity embedding vector e_j or the attribute embedding vector a_k, the entity relationship embedding vector re_l or the attribute relationship embedding vector ra_m, and the entity embedding vector set E. The output includes the triple embedding vector candidate set C and the corresponding candidate score set S. The entity prediction algorithm performs head entity prediction on the entity-entity triples, tail entity prediction on the entity-entity triples, and head entity prediction on the entity-attribute triples respectively.
For head entity prediction on the entity-entity triples, each entity embedding vector is combined with the given tail entity embedding vector e_j and entity relationship embedding vector re_l, and the score of each combination is calculated using the scoring function on the entity-entity triples.

For tail entity prediction on the entity-entity triples, each entity embedding vector is combined with the given head entity embedding vector e_i and entity relationship embedding vector re_l, and the score of each combination is calculated using the scoring function on the entity-entity triples.

For head entity prediction on the entity-attribute triples, each entity embedding vector is combined with the given attribute embedding vector a_k and attribute relationship embedding vector ra_m, and the score of each combination is calculated using the scoring function on the entity-attribute triples.
The sort() function in the entity prediction algorithm arranges the triples in ascending order of their scores; the higher a triple is ranked, the more likely it is to be correct, so the top-ranked triples are extracted and converted into entity-entity triples or entity-attribute triples. Finally, the newly obtained entity-entity triples and entity-attribute triples are added to the knowledge graph to complete the knowledge graph completion using entity prediction.
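As an illustrative sketch, head entity prediction can be implemented as a ranking over all entity embedding vectors using a scoring function of the form defined earlier; the function name predict_head_entities and the parameter top_n are assumptions, and tail entity, attribute and relationship prediction follow the same pattern by iterating over the corresponding embedding vector set.

def predict_head_entities(ent, re_l, e_j, score_fn, top_n=10):
    """Rank candidate head entities for (?, re_l, e_j) by ascending score.

    ent: dict mapping entity name -> embedding vector;
    score_fn: an entity-entity scoring function such as score_entity_triple.
    """
    scored = [(score_fn(vec, re_l, e_j), name) for name, vec in ent.items()]
    scored.sort(key=lambda pair: pair[0])   # ascending: smaller score = more plausible
    return [name for _, name in scored[:top_n]]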
And S342, the knowledge graph is completed by using an attribute prediction algorithm. Specifically, the input of the attribute prediction algorithm includes the head entity embedding vector e_i, the attribute relationship embedding vector ra_m and the attribute embedding vector set A. The output includes the entity-attribute triple embedding vector candidate set C and the corresponding candidate score set S. In attribute prediction, each attribute embedding vector is combined with the given head entity embedding vector e_i and attribute relationship embedding vector ra_m, and the score of each combination is calculated using the scoring function on the entity-attribute triples.
The sort() function in the attribute prediction algorithm arranges the entity-attribute triples in ascending order of their scores; the higher a triple is ranked, the more likely it is to hold, so the top-ranked triples are extracted and converted into entity-attribute triples. Finally, the newly obtained entity-attribute triples are added to the knowledge graph to complete the knowledge graph completion using attribute prediction.
And S343, the knowledge graph is completed by using a relationship prediction algorithm. Specifically, the input of the relationship prediction algorithm includes flag, the head entity embedding vector e_i, the tail entity embedding vector e_j or the attribute embedding vector a_k, and the entity relationship embedding vector set Re or the attribute relationship embedding vector set Ra. The output includes the triple embedding vector candidate set C and the corresponding candidate score set S. The relationship prediction algorithm performs entity relationship prediction and attribute relationship prediction respectively.
For entity relationship prediction, each entity relationship embedding vector is combined with the given head entity embedding vector e_i and tail entity embedding vector e_j, and the score of each combination is calculated using the scoring function on the entity-entity triples.

For attribute relationship prediction, each attribute relationship embedding vector is combined with the given head entity embedding vector e_i and attribute embedding vector a_k, and the score of each combination is calculated using the scoring function on the entity-attribute triples.
The sort() function in the relationship prediction algorithm arranges the triples in ascending order of their scores; the higher a triple is ranked, the more likely it is to hold, so the top-ranked triples are extracted and converted into entity-entity triples or entity-attribute triples. Finally, the newly obtained entity-entity triples and entity-attribute triples are added to the knowledge graph to complete the knowledge graph completion using relationship prediction.
The method performs two rounds of completion on the knowledge graph, one based on rules and one based on embeddings, so that the information in the knowledge graph becomes richer and more accurate.
As shown in fig. 3, the second, embedding-based completion is performed on the school social network knowledge graph obtained from the first, rule-based completion. In fig. 3, the bold circles represent predicted head or tail entities, the bold solid lines represent predicted entity relationships, and the bold dashed lines represent predicted attribute relationships. For entity-entity triples, head entity prediction is performed, e.g. the head entity "Xiao Hong" in the entity-entity triple (Xiao Hong, friend, Xiao Ming)e; tail entity prediction, e.g. the tail entity "Wang Wu" in the entity-entity triple (Sun Jie, student, Wang Wu)e; and entity relationship prediction, e.g. the entity relationship "classmate" in the entity-entity triple (Zheng Qi, classmate, Zhao Liu)e. For entity-attribute triples, head entity prediction is performed, e.g. the head entity "Xiao Ming" in the entity-attribute triple (Xiao Ming, punishment, fighting)a; attribute prediction, e.g. the attribute "fighting" in the entity-attribute triple (Li Si, punishment, fighting)a; and attribute relationship prediction, e.g. the attribute relationship "punishment" in the entity-attribute triple (Zheng Qi, punishment, fighting)a. From the existing entity-entity and entity-attribute triples together with the completed ones, for example that Zheng Qi and Zhao Liu are classmates and that Zheng Qi has been punished for fighting, the possibility that Zheng Qi bullies Zhao Liu can be inferred through the relationship between Zheng Qi and Zhao Liu and the attributes of Zheng Qi.
By completing the school social network knowledge graph of this embodiment with the rule-based and embedding-based knowledge graph completion method, the possibility that Zhang San, Li Si and Zheng Qi may be bullying Zhao Liu can be analysed; based on this presumption, it can be further investigated whether the three actually bully Zhao Liu, so that corresponding measures can be taken to prevent the bullying situation from worsening.
The above-described calculation examples of the present invention are merely to explain the calculation model and the calculation flow of the present invention in detail, and are not intended to limit the embodiments of the present invention. It will be apparent to those skilled in the art that other variations and modifications of the present invention can be made based on the above description, and it is not intended to be exhaustive or to limit the invention to the precise form disclosed, and all such modifications and variations are possible and contemplated as falling within the scope of the invention.

Claims (10)

1. A knowledge graph completion method based on rules and embedding, characterized by specifically comprising the following steps:
step S1, defining relation triples in the knowledge graph, wherein the relation triples comprise entity-entity triples and entity-attribute triples;
s2, complementing the entity-entity triples by using the transfer rule, the anti-symmetric rule and the entity association rule, and complementing the entity-attribute triples by using the attribute association rule to obtain a knowledge graph subjected to first complementing;
step S3, converting the entity, attribute, entity relationship and attribute relationship in the knowledge graph after the first completion into an entity embedding vector, an attribute embedding vector, an entity relationship embedding vector and an attribute relationship embedding vector respectively;
taking the entity embedding vector, the attribute embedding vector, the entity relationship embedding vector and the attribute relationship embedding vector as the input of a TransE model, defining a score function on an entity-entity triple, a loss function on the entity-entity triple, a score function on the entity-attribute triple and a loss function on the entity-attribute triple, and training the entity-entity triple and the entity-attribute triple by using a TransE model training algorithm to obtain the trained entity embedding vector, attribute embedding vector, entity relationship embedding vector and attribute relationship embedding vector;
and completing the completion of the knowledge graph after performing secondary completion on the knowledge graph by using an entity prediction algorithm, an attribute prediction algorithm and a relation prediction algorithm.
2. The knowledge graph completion method based on rules and embedding according to claim 1, wherein the specific process of step S1 is as follows:
s11, all entities in the knowledge graph form an entity set, all attributes form an attribute set, all entity relations form an entity relation set, and all attribute relations form an attribute relation set;
s12, defining an entity-entity triple based on the entity in the entity set and the entity relationship in the entity relationship set;
and S13, defining the entity-attribute triple based on the entity in the entity set, the attribute in the attribute set and the attribute relationship in the attribute relationship set.
3. The knowledge graph completion method based on rules and embedding according to claim 2, wherein the completion of the entity-entity triples by using the transfer rule is performed according to the following specific process:

S211, the transfer rule is defined as

(e_i, re_l, e_j)_e \wedge (e_j, re_m, e_k)_e \Rightarrow (e_i, re_n, e_k)_e

wherein e_i, e_j, e_k ∈ E represent entities, E represents the entity set, re_l, re_m, re_n ∈ Re represent entity relationships, Re represents the entity relationship set, and (·)_e represents an entity-entity triple; all specific transfer rules conforming to the form of the transfer rule are extracted from the knowledge graph;

S212, extracting the key information of the specific transfer rules to form transfer rule candidates tc = (re_l, re_m, re_n)_c, all transfer rule candidates constituting the transfer rule candidate set Stc;

S213, filtering each transfer rule candidate tc by using the transfer rule candidate filtering algorithm to obtain a transfer rule correct candidate trc = (re_l, re_m, re_n)_r, all transfer rule correct candidates constituting the transfer rule correct candidate set Strc;

S214, obtaining a transfer rule completion entity-entity triple tet by using two known entity-entity triples in the knowledge graph and the corresponding transfer rule correct candidate trc, the transfer rule completion entity-entity triples forming the transfer rule completion entity-entity triplet set Stet = {tet1, tet2, ..., tet|Stet|}, where tet1 represents the 1st transfer rule completion entity-entity triple in the triplet set Stet and |Stet| is the number of transfer rule completion entity-entity triples contained in the triplet set Stet;
s215, adding all entity-entity triples in the transfer rule completion entity-entity triplet set Stet into the knowledge graph, and completing the knowledge graph completion based on the transfer rule.
4. The rule-based and embedded knowledge-graph completion method according to claim 3, wherein the completion of the entity-entity triples by using the antisymmetric rule is carried out by the following specific processes:
s221, the antisymmetric rule is defined as
(e_i, re_l, e_j)_e \Rightarrow (e_j, re_m, e_i)_e

wherein e_i, e_j ∈ E represent entities, E represents the entity set, re_l, re_m ∈ Re represent entity relationships, and Re represents the entity relationship set; all specific antisymmetric rules conforming to the form of the antisymmetric rule are extracted from the knowledge graph;

S222, extracting the key information of the specific antisymmetric rules to form antisymmetric rule candidates ac = (re_l, re_m)_c, all antisymmetric rule candidates constituting the antisymmetric rule candidate set Sac;

S223, filtering each antisymmetric rule candidate ac by using the antisymmetric rule candidate filtering algorithm to obtain an antisymmetric rule correct candidate arc = (re_l, re_m)_r, all antisymmetric rule correct candidates constituting the antisymmetric rule correct candidate set Sarc;

S224, obtaining an antisymmetric rule completion entity-entity triple tea by using the known entity-entity triples in the knowledge graph and the corresponding antisymmetric rule correct candidate arc, all antisymmetric rule completion entity-entity triples forming the antisymmetric rule completion entity-entity triplet set Stea = {tea1, tea2, ..., tea|Stea|}, where tea1 represents the 1st antisymmetric rule completion entity-entity triple in the triplet set Stea and |Stea| is the number of antisymmetric rule completion entity-entity triples contained in the triplet set Stea;
s225, adding all entity-entity triples in the entity-entity triplet set Stea to the knowledge graph by the aid of the anti-symmetric rules, and completing knowledge graph completion based on the anti-symmetric rules.
5. The rule-and-embedded-based knowledge graph completion method according to claim 4, wherein the completion of the entity-entity triplet is performed by using the entity association rule, and the specific process is as follows:
s231, defining the entity association rule as
(e_i, re_l, e_j)_e \Rightarrow (e_i, re_m, e_k)_e

wherein e_i, e_j, e_k ∈ E and e_j ≠ e_k, e_i, e_j, e_k represent entities, E represents the entity set, re_l, re_m ∈ Re, re_l, re_m represent entity relationships, and Re represents the entity relationship set; all specific entity association rules conforming to the form of the entity association rule are extracted from the knowledge graph;

S232, extracting the key information of the specific entity association rules to form entity association rule candidates rc = (re_l, e_j, re_m, e_k)_cr, all entity association rule candidates constituting the entity association rule candidate set Src;

S233, filtering each entity association rule candidate rc by using the entity association rule candidate filtering algorithm to obtain an entity association rule correct candidate rrc = (re_l, e_j, re_m, e_k)_rr, all entity association rule correct candidates constituting the entity association rule correct candidate set Srrc;

S234, obtaining an entity association rule completion entity-entity triple ter by using the known entity-entity triples in the knowledge graph and the corresponding entity association rule correct candidate rrc, the entity association rule completion entity-entity triples constituting the entity association rule completion entity-entity triplet set Ster = {ter1, ter2, ..., ter|Ster|}, where ter1 represents the 1st entity association rule completion entity-entity triple in the triplet set Ster and |Ster| is the number of entity association rule completion entity-entity triples contained in the triplet set Ster;
s235, adding all entity-entity triples in the entity-entity triple set Ster to the knowledge graph according to the entity association rule completion, and completing the knowledge graph completion based on the entity association rule.
6. The rule and embedded-based knowledge graph completion method according to claim 5, wherein the completion of the entity-attribute triples by using the attribute association rule comprises the following specific processes:
s241, defining the attribute association rule as
(e_i, ra_l, a_j)_a \Rightarrow (e_i, ra_m, a_k)_a

wherein e_i ∈ E, e_i represents an entity, E represents the entity set, a_j, a_k ∈ A and a_j ≠ a_k, a_j, a_k represent attributes, A represents the attribute set, ra_l, ra_m ∈ Ra, ra_l, ra_m represent attribute relationships, Ra represents the attribute relationship set, and (·)_a represents an entity-attribute triple; all specific attribute association rules conforming to the form of the attribute association rule are extracted from the knowledge graph;

S242, extracting the key information of the specific attribute association rules to form attribute association rule candidates atrc = (ra_l, a_j, ra_m, a_k)_car, all attribute association rule candidates constituting the attribute association rule candidate set Satrc;

S243, filtering each attribute association rule candidate atrc by using the attribute association rule candidate filtering algorithm to obtain an attribute association rule correct candidate arrc = (ra_l, a_j, ra_m, a_k)_aar, all attribute association rule correct candidates constituting the attribute association rule correct candidate set Sarrc;

S244, obtaining an attribute association rule completion entity-attribute triple tar by using the known entity-attribute triples in the knowledge graph and the corresponding attribute association rule correct candidate arrc, all attribute association rule completion entity-attribute triples forming the attribute association rule completion entity-attribute triplet set Star = {tar1, tar2, ..., tar|Star|}, where tar1 represents the 1st attribute association rule completion entity-attribute triple in the triplet set Star and |Star| is the number of attribute association rule completion entity-attribute triples contained in the triplet set Star;
s245, adding all entity-attribute triples in the attribute association rule completion entity-attribute triplet set Star into the knowledge graph, and completing the completion of the knowledge graph based on the attribute association rule.
7. The method according to claim 6, wherein in step S3, the entity, the attribute, the entity relationship and the attribute relationship in the first completed knowledge graph are respectively converted into an entity embedding vector, an attribute embedding vector, an entity relationship embedding vector and an attribute relationship embedding vector, and the specific process is as follows:
randomly initializing the entity into a vector with a dimension of k by utilizing uniform distribution, and then carrying out standardization processing on the vector obtained by random initialization to obtain an entity embedded vector;
and similarly, obtaining an attribute embedded vector, an entity relationship embedded vector and an attribute relationship embedded vector.
8. The rule-based and embedded knowledge-graph completion method of claim 7, wherein the definition of the scoring function on the entity-entity triplets is:
f_e(e_i, re_l, e_j) = \| e_i + re_l - e_j \|_{L_1/L_2}

wherein f_e(e_i, re_l, e_j) represents the scoring function on the entity-entity triple (e_i, re_l, e_j)_e, e_i and e_j represent the entity embedding vectors corresponding to the entities e_i and e_j, re_l represents the entity relationship embedding vector corresponding to the entity relationship re_l, and L1|L2 represents the 1-norm or the 2-norm;

the definition of the loss function on the entity-entity triples is:

L_e = \sum_{(e_i, re_l, e_j)_e \in Te} \sum_{(e_i', re_l, e_j')_e \in Te'} [ f_e(e_i, re_l, e_j) - f_e(e_i', re_l, e_j') + \gamma ]_+

wherein L_e represents the loss function on the entity-entity triple (e_i, re_l, e_j)_e, (e_i, re_l, e_j)_e ∈ Te, Te represents the entity-entity triplet set, (e_i', re_l, e_j')_e represents the replaced entity-entity triple obtained by randomly replacing e_i or e_j in (e_i, re_l, e_j)_e, (e_i', re_l, e_j')_e ∈ Te', Te' represents the set of replaced entity-entity triples, e_i' and e_j' represent the entity embedding vectors corresponding to e_i' and e_j', γ is a margin hyper-parameter, and [X]_+ is X if X is a number greater than or equal to 0 and 0 otherwise;

the scoring function on the entity-attribute triples and the loss function on the entity-attribute triples are defined in the same way.
9. The method according to claim 8, wherein the entity-entity triples are trained using a TransE model training algorithm, which comprises:
randomly sampling b entity-entity triples from the entity-entity triple set of the knowledge graph after the first completion, then carrying out negative sampling on the entity-entity triples obtained by sampling, namely randomly replacing head entities in the entity-entity triples with other entities or tail entities with other entities, forming negative sampling entity-entity triples by the replaced entity-entity triples, and then adding the entity-entity triples and the negative sampling entity-entity triple combinations corresponding to the entity-entity triples into the set T;
calculating loss function values of each combination in the set T in sequence, and updating parameters of the TransE model by adopting a gradient descent method after calculating the loss function value of each combination;
the method of training entity-attribute triples is the same as above.
10. The rule and embedding based knowledge graph completion method according to claim 9, wherein the knowledge graph is completed for the second time by using an entity prediction algorithm, an attribute prediction algorithm and a relationship prediction algorithm, and the specific process is as follows:
s331, complementing the knowledge graph by using an entity prediction algorithm
The entity prediction algorithm respectively carries out head entity prediction of the entity-entity triple, tail entity prediction of the entity-entity triple and head entity prediction of the entity-attribute triple;
for head entity prediction on entity-entity triplets, each trained entity embedding vector is associated with a given trained tail entity embedding vector
Figure FDA0002947522000000054
And entity relationship embedding vector
Figure FDA0002947522000000055
Combining, and calculating a score for each combination using a scoring function on the entity-entity triplets, and then storing the traversed entity embedding vectorEmbedding vectors of quantities with given entity relationships
Figure FDA0002947522000000056
And tail entity embedding vector
Figure FDA0002947522000000057
Embedding the composed entity-entity triple into the vector candidate;
for tail entity prediction on entity-entity triplets, each trained entity embedding vector is associated with a given trained head entity embedding vector
Figure FDA0002947522000000061
And entity relationship embedding vector
Figure FDA0002947522000000062
Combining and calculating a score for each combination using a scoring function on the entity-entity triplets, and then storing the traversed entity embedding vectors with the given entity relationship embedding vectors
Figure FDA0002947522000000063
And head entity embedding vector
Figure FDA0002947522000000064
Embedding the composed entity-entity triple into a vector candidate;
for head entity prediction on entity-attribute triplets, each trained entity embedding vector is associated with a given trained attribute embedding vector
Figure FDA0002947522000000065
Sum attribute relationship embedded vector
Figure FDA0002947522000000066
Combining, and calculating scores for each combination using a scoring function on entity-attribute triplets, and then storing the traversed entitiesEmbedding vector with given attribute relation
Figure FDA0002947522000000067
And attribute embedded vector
Figure FDA0002947522000000068
Embedding the composed entity-attribute triple into a vector candidate;
the sort () function in the entity prediction algorithm arranges the scores of the triples in ascending order, adds the first N triples in the order to the knowledge graph, and completes the completion of the knowledge graph by using the entity prediction algorithm;
s332, performing knowledge graph completion by using attribute prediction algorithm
the attribute prediction algorithm combines each trained attribute embedding vector with the given trained head entity embedding vector e_i and attribute relationship embedding vector ra_m, calculates the score of each combination using the scoring function on the entity-attribute triples, and stores the entity-attribute triple embedding vector candidates composed of the traversed attribute embedding vectors, the given attribute relationship embedding vector ra_m and the given head entity embedding vector e_i;

the sort() function in the attribute prediction algorithm arranges the scores of the entity-attribute triples in ascending order, the first N triples in the order are added to the knowledge graph, and the completion of the knowledge graph by using the attribute prediction algorithm is finished;
s333, complementing the knowledge graph by using a relation prediction algorithm
for entity relationship prediction, each trained entity relationship embedding vector is combined with the given trained head entity embedding vector e_i and tail entity embedding vector e_j, the score of each combination is calculated using the scoring function on the entity-entity triples, and the entity-entity triple embedding vector candidates composed of the traversed entity relationship embedding vectors, the given head entity embedding vector e_i and the given tail entity embedding vector e_j are stored;

for attribute relationship prediction, each trained attribute relationship embedding vector is combined with the given trained head entity embedding vector e_i and attribute embedding vector a_k, the score of each combination is calculated using the scoring function on the entity-attribute triples, and the entity-attribute triple embedding vector candidates composed of the traversed attribute relationship embedding vectors, the given head entity embedding vector e_i and the given attribute embedding vector a_k are stored;

the sort() function in the relationship prediction algorithm arranges the scores of the triples in ascending order, the first N triples in the order are added to the knowledge graph, and the completion of the knowledge graph by using the relationship prediction algorithm is finished.
CN202110197370.9A 2021-02-22 2021-02-22 Knowledge graph completion method based on rules and embedding Pending CN112818134A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110197370.9A CN112818134A (en) 2021-02-22 2021-02-22 Knowledge graph completion method based on rules and embedding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110197370.9A CN112818134A (en) 2021-02-22 2021-02-22 Knowledge graph completion method based on rules and embedding

Publications (1)

Publication Number Publication Date
CN112818134A true CN112818134A (en) 2021-05-18

Family

ID=75864625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110197370.9A Pending CN112818134A (en) 2021-02-22 2021-02-22 Knowledge graph completion method based on rules and embedding

Country Status (1)

Country Link
CN (1) CN112818134A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113672741A (en) * 2021-08-19 2021-11-19 支付宝(杭州)信息技术有限公司 Information processing method, device and equipment
CN113672741B (en) * 2021-08-19 2024-06-21 支付宝(杭州)信息技术有限公司 Information processing method, device and equipment
CN113722611A (en) * 2021-08-23 2021-11-30 讯飞智元信息科技有限公司 Method, device and equipment for recommending government affair service and computer readable storage medium
CN113722611B (en) * 2021-08-23 2024-03-08 讯飞智元信息科技有限公司 Recommendation method, device and equipment for government affair service and computer readable storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination