CN115757806A - Hyper-relation knowledge graph embedding method and device, electronic equipment and storage medium - Google Patents

Hyper-relation knowledge graph embedding method and device, electronic equipment and storage medium

Info

Publication number
CN115757806A
Authority
CN
China
Prior art keywords
knowledge graph
hyper
fact
training
embedded
Prior art date
Legal status
Granted
Application number
CN202211154145.8A
Other languages
Chinese (zh)
Other versions
CN115757806B (en)
Inventor
刘宇
李勇
金德鹏
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202211154145.8A
Publication of CN115757806A
Application granted
Publication of CN115757806B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a hyper-relational knowledge graph embedding method and apparatus, an electronic device and a storage medium. The method comprises the following steps: acquiring a hyper-relational knowledge graph to be embedded; converting the hyper-relational knowledge graph to be embedded, based on an attribute expansion operation, into a knowledge graph corresponding to it, wherein the knowledge graph comprises a plurality of facts; obtaining a training sample set corresponding to the fact tuples based on the fact tuples corresponding to the facts; training a characterization model based on the training sample set to obtain a trained characterization model; and inputting the hyper-relational knowledge graph to be embedded into the trained characterization model to obtain the embedded hyper-relational knowledge graph output by the trained characterization model. The invention improves the expressive power and prediction performance of the embedded hyper-relational knowledge graph.

Description

Hyper-relation knowledge graph embedding method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of knowledge graphs, and in particular to a hyper-relational knowledge graph embedding method and apparatus, an electronic device and a storage medium.
Background
The knowledge graph structurally represents and stores real-world facts in graph form: the things and concepts contained in the facts correspond to entities in the knowledge graph, and the relationships among the entities correspond to edges. Recently, research has proposed mining additional attribute-attribute value pairs to enrich the semantic information of triples, which yields the hyper-relational knowledge graph.
In the related art, hyper-relational knowledge graph embedding is the main approach to hyper-relational knowledge graph completion: entities (including attribute values) and relations (including attributes) in the hyper-relational knowledge graph are projected into a continuous low-dimensional vector space, and the resulting vectors are called embedding vectors or characterization vectors. However, current hyper-relational knowledge graph embedding methods suffer from limited expressive power and poor prediction performance.
Disclosure of Invention
The invention provides a hyper-relational knowledge graph embedding method and apparatus, an electronic device and a storage medium, which are used to overcome the defects of limited expressive power and poor prediction performance of hyper-relational knowledge graph embedding methods in the prior art, and to improve the expressive power and prediction performance of the embedded hyper-relational knowledge graph.
The invention provides a hyper-relational knowledge graph embedding method, which comprises the following steps: acquiring a hyper-relational knowledge graph to be embedded; converting the hyper-relational knowledge graph to be embedded, based on an attribute expansion operation, into a knowledge graph corresponding to the hyper-relational knowledge graph to be embedded, wherein the knowledge graph comprises a plurality of facts; obtaining a training sample set corresponding to the fact tuples based on the fact tuples corresponding to the facts; training a characterization model based on the training sample set to obtain a trained characterization model; and inputting the hyper-relational knowledge graph to be embedded into the trained characterization model to obtain the embedded hyper-relational knowledge graph output by the trained characterization model.
According to the hyper-relational knowledge graph embedding method provided by the invention, the attribute expansion operation comprises a star attribute expansion operation, the hyper-relational knowledge graph comprises a plurality of hyper-relational facts, each hyper-relational fact comprises a core triple and attribute-attribute value pairs, and the core triple comprises a head entity, a tail entity and a relation; converting the hyper-relational knowledge graph to be embedded into the corresponding knowledge graph based on the attribute expansion operation specifically comprises: introducing an intermediate entity and two intermediate relations; connecting, through the star attribute expansion operation, the intermediate entity with the head entity and the tail entity respectively based on the intermediate relations; and connecting the intermediate entity with the attribute value entity in each attribute-attribute value pair based on the attribute relation in that pair, and connecting the head entity with the tail entity based on the relation in the core triple, to jointly obtain the knowledge graph corresponding to the hyper-relational knowledge graph to be embedded.
According to the hyper-relational knowledge graph embedding method provided by the invention, the attribute expansion operation comprises a clique attribute expansion operation, the hyper-relational knowledge graph comprises a plurality of hyper-relational facts, each hyper-relational fact comprises a core triple and attribute-attribute value pairs, and the core triple comprises a head entity, a tail entity and a relation; converting the hyper-relational knowledge graph to be embedded into the corresponding knowledge graph based on the attribute expansion operation specifically comprises: introducing four intermediate relations, wherein the intermediate relations are determined according to the relations in the core triples and include a first intermediate relation, a second intermediate relation, a third intermediate relation and a fourth intermediate relation; connecting, through the clique attribute expansion operation, the head entity and the attribute value entities of the attribute-attribute value pairs based on the first intermediate relation and the second intermediate relation; and connecting the tail entity and the attribute value entities of the attribute-attribute value pairs based on the third intermediate relation and the fourth intermediate relation, and connecting the head entity and the tail entity based on the relation in the core triple, to jointly obtain the knowledge graph corresponding to the hyper-relational knowledge graph to be embedded.
According to the hyper-relational knowledge graph embedding method provided by the invention, the training sample set comprises a positive training sample set and a negative training sample set; obtaining the training sample set corresponding to the fact tuples based on the fact tuples corresponding to the facts specifically comprises: determining the fact tuple corresponding to a fact as the positive training sample set corresponding to that fact tuple; and replacing the head entity and/or tail entity in the fact tuple, based on the fact tuple corresponding to the fact, to obtain the negative training sample set corresponding to the fact tuple.
According to the hyper-relational knowledge graph embedding method provided by the invention, the training sample set comprises training fact tuples corresponding to the fact tuples, and each training fact tuple comprises training entities and training relations; training the characterization model based on the training sample set to obtain the trained characterization model specifically comprises: inputting the training sample set into the characterization model, and updating the characterizations of the training entities and training relations in the training sample set based on an encoder in the characterization model to obtain updated training entity characterization vectors and updated training relation characterization vectors; inputting the updated training entity characterization vectors and the updated training relation characterization vectors into a decoder to obtain fact scores corresponding to the training fact tuples, wherein the decoder is based on a scoring function of the hyper-relational knowledge graph to be embedded; obtaining a loss function based on the fact scores; and updating the parameters of the characterization model based on the loss function to obtain the trained characterization model.
According to the hyper-relational knowledge graph embedding method provided by the invention, the training fact tuples comprise positive training fact tuples corresponding to the positive training sample set and negative training fact tuples corresponding to the negative training sample set, and the fact scores comprise positive training fact scores corresponding to the positive training fact tuples and negative training fact scores corresponding to the negative training fact tuples; the loss function is determined based on the fact scores using the following formula:
L = Σ_x Σ_i −log [ exp(φ(x)) / ( exp(φ(x)) + Σ_{x′∈N_i(x)} exp(φ(x′)) ) ]
wherein L represents the loss function; φ(x) represents the positive training fact score; φ(x′) represents the negative training fact score; N_i(x) represents the negative training sample set of the fact tuple x constructed for the i-th position; and x′ represents a negative training fact tuple.
The invention also provides a hyper-relational knowledge graph embedding apparatus, which comprises: an acquisition module, configured to acquire the hyper-relational knowledge graph to be embedded; a conversion module, configured to convert the hyper-relational knowledge graph to be embedded into the knowledge graph corresponding to it based on the attribute expansion operation, wherein the knowledge graph comprises a plurality of facts; a processing module, configured to obtain the training sample set corresponding to the fact tuples based on the fact tuples corresponding to the facts; a training module, configured to train the characterization model based on the training sample set to obtain the trained characterization model; and a generation module, configured to input the hyper-relational knowledge graph to be embedded into the trained characterization model to obtain the embedded hyper-relational knowledge graph output by the trained characterization model.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the hyper-relational knowledge graph embedding method described above.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a hyper-relational knowledge graph embedding method as any one of the above.
The present invention also provides a computer program product comprising a computer program which, when executed by a processor, implements a hyper-relational knowledge graph embedding method as any one of the above.
With the hyper-relational knowledge graph embedding method and apparatus, the electronic device and the storage medium, the hyper-relational knowledge graph to be embedded can be converted, through the attribute expansion operation, into a regular knowledge graph corresponding to it. Because embedding the hyper-relational knowledge graph on the basis of a regular knowledge graph can improve the quality of the embedded hyper-relational knowledge graph, in the invention a training sample set is obtained based on the fact tuples corresponding to the facts in the regular knowledge graph, a trained characterization model is obtained by training on the training sample set, and the hyper-relational knowledge graph to be embedded is input into the trained characterization model to obtain the embedded hyper-relational knowledge graph output by the trained characterization model. The invention thus improves the expressive power and prediction performance of the embedded hyper-relational knowledge graph.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is one of the flow diagrams of the hyper-relational knowledge graph embedding method provided by the present invention;
FIG. 2 is one of the flow diagrams of converting the hyper-relational knowledge graph to be embedded into the knowledge graph corresponding to it based on the attribute expansion operation, provided by the present invention;
FIG. 3 is a schematic diagram of the K-th hyper-relational fact in a hyper-relational knowledge graph provided by the present invention;
FIG. 4 is a schematic diagram of the knowledge graph obtained by applying the star attribute expansion operation to the K-th hyper-relational fact in the hyper-relational knowledge graph provided by the present invention;
FIG. 5 is a second schematic flow chart of converting the hyper-relational knowledge graph to be embedded into the knowledge graph corresponding to it based on the attribute expansion operation, provided by the present invention;
FIG. 6 is a schematic diagram of the knowledge graph obtained by applying the clique attribute expansion operation to the K-th hyper-relational fact in the hyper-relational knowledge graph provided by the present invention;
FIG. 7 is a schematic flow chart of training a characterization model based on a training sample set to obtain a trained characterization model according to the present invention;
FIG. 8 is a second flowchart of the hyper-relational knowledge graph embedding method provided by the present invention;
FIG. 9 is a schematic structural diagram of the hyper-relational knowledge graph embedding apparatus provided by the present invention;
FIG. 10 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A Knowledge Graph is a graph-structured representation and storage of facts in the real world, in which the things and concepts contained in the facts correspond to entities and the relationships among the entities correspond to edges. A knowledge graph mainly describes facts as triples (s, r, o), where s and o represent the head and tail entities, respectively, and r represents the relationship between the two.
For a Hyper-relational Knowledge Graph, one hyper-relational fact may be represented as ((s, r, o), {(a_i, v_i)}_{i=1}^{n}), where (s, r, o) corresponds to the core triple and {(a_i, v_i)}_{i=1}^{n} corresponds to the attribute-attribute value description set containing n attribute-attribute value pairs.
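To make this representation concrete, the following minimal sketch shows one possible in-memory form of a hyper-relational fact as a nested tuple; the entity, relation and attribute names are hypothetical and not taken from the patent.

```python
from typing import List, Tuple

# Illustrative data structure for a hyper-relational fact
# ((s, r, o), {(a_1, v_1), ..., (a_n, v_n)}); all names are hypothetical.
CoreTriple = Tuple[str, str, str]        # (head entity s, relation r, tail entity o)
AttrValuePair = Tuple[str, str]          # (attribute a_i, attribute value v_i)
HyperRelationalFact = Tuple[CoreTriple, List[AttrValuePair]]

fact: HyperRelationalFact = (
    ("Marie_Curie", "educated_at", "University_of_Paris"),
    [("academic_degree", "Master_of_Science"), ("academic_major", "Physics")],
)
```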
Since the hyper-relational knowledge graph is not complete and a large number of facts are missing, it is necessary to infer the missing relations/attributes between entities/attribute values (hereinafter, the entities include s, o, v, etc., and the relations include r, a, etc.) based on the existing hyper-relational knowledge graph information, that is, to perform hyper-relational knowledge graph completion.
The hyper-relational knowledge graph embedding method provided by the invention can be used to complete the hyper-relational knowledge graph. The method is based on attribute expansion: entities and relations are embedded into a low-dimensional continuous vector space; the hyper-relational knowledge graph to be embedded is expanded into a general knowledge graph through the attribute expansion operation; and an encoder based on a multi-relational graph convolutional network is used to propagate over the expanded knowledge graph and update the embedded representations. Further, the entity and relation characterization vectors output by the encoder are input into a decoder based on an existing hyper-relational knowledge graph scoring function, which calculates the possibility that a plurality of entities and the corresponding relations form a hyper-relational fact. Through learning on the input hyper-relational knowledge graph to be embedded, the characterization vectors of the entities and relations can be obtained, and the completion of the whole hyper-relational knowledge graph can then be accomplished.
Based on this hyper-relational knowledge graph embedding method, the embedded hyper-relational knowledge graph output by the characterization model can be applied in various scenarios, for example to solve completion problems, recommendation problems, and search and question-answering problems concerning the hyper-relational knowledge graph to be embedded.
In one embodiment, the hyper-relational knowledge graph embedding method provided by the invention can be used, in a search question-answering scenario, to complete missing relations (corresponding to the hyper-relational knowledge graph to be embedded) in a question to be answered, thereby completing the question (corresponding to the embedded hyper-relational knowledge graph). Furthermore, intelligent search can be performed on the basis of the completed question, so that the answers obtained by searching are more accurate.
In an example, a hyper-relational knowledge graph to be embedded about a question to be answered may be constructed from that question, and converted into the corresponding knowledge graph through the attribute expansion operation. The knowledge graph comprises a plurality of facts, and the facts can correspond to keywords in the question.
Furthermore, a training sample set corresponding to the fact tuples is obtained based on the fact tuples corresponding to the facts, and the characterization model is then trained based on the training sample set to obtain a trained characterization model. In the application process, the hyper-relational knowledge graph to be embedded that was constructed from the question can be input into the trained characterization model to obtain the embedded hyper-relational knowledge graph output by the trained characterization model.
It should be noted that the embedded hyper-relational knowledge graph is obtained by supplementing missing relations based on the existing facts and relations in the hyper-relational knowledge graph to be embedded. Accurate semantic information about the question can then be obtained based on the embedded hyper-relational knowledge graph, so that more accurate answers to the question can be obtained by searching with this more accurate semantic information.
In yet another embodiment, the entities in the hyper-relational knowledge graph to be embedded may concern a user and the user's educational experience. In the application process, a hyper-relational knowledge graph to be embedded about the user can be constructed based on the user's educational experience information, and converted into the corresponding knowledge graph through the attribute expansion operation. The knowledge graph includes a plurality of facts that may correspond to key information of the user's educational experience, such as the undergraduate institution and the graduate school attended.
Furthermore, a training sample set corresponding to the fact tuples is obtained based on the fact tuples corresponding to the facts, and the characterization model is then trained based on the training sample set to obtain a trained characterization model. In the application process, the hyper-relational knowledge graph to be embedded that was constructed from the user's educational experience can be input into the trained characterization model to obtain the embedded hyper-relational knowledge graph output by the trained characterization model.
It should be noted that the embedded hyper-relational knowledge graph is obtained by supplementing missing relations based on the existing facts and relations in the hyper-relational knowledge graph to be embedded. Links involving the user's educational experience (for example, the link between the bachelor's degree certificate in major A and the undergraduate institution, and the link between the graduate degree certificate in major A and the graduate school) can then be obtained based on the embedded hyper-relational knowledge graph, so that the user's educational experience is completed, laying a foundation for comprehensively obtaining the user's information.
To further describe the superrelationship knowledge graph embedding method provided by the present invention, the following description is made with reference to fig. 1.
FIG. 1 is a schematic flow diagram of a hyper-relational knowledge graph embedding method provided by the invention.
In an exemplary embodiment of the present invention, as can be seen in fig. 1, the hyper-relational knowledge graph embedding method may include steps 110 to 150, which will be described separately below.
In step 110, a hyper-relational knowledge graph to be embedded is obtained.
In an embodiment, the hyper-relational knowledge graph to be embedded may be a hyper-relational knowledge graph lacking the relationship/attribute between the entities/attribute values, that is, the hyper-relational knowledge graph to be embedded is not complete.
In step 120, based on the attribute unfolding operation, the hyper-relational knowledge graph to be embedded is converted into a knowledge graph corresponding to the hyper-relational knowledge graph to be embedded, wherein the knowledge graph comprises a plurality of facts.
In one embodiment, the hyper-relational knowledge graph to be embedded can be transformed through the attribute expansion operation to obtain a regular knowledge graph. Since embedding the hyper-relational knowledge graph on the basis of a regular knowledge graph allows more classical and effective embedding schemes to be applied, the quality of the embedded hyper-relational knowledge graph can be guaranteed, and the expressive power and prediction performance of the embedded hyper-relational knowledge graph are improved.
In step 130, based on the fact tuples corresponding to the facts, a training sample set corresponding to the fact tuples is obtained.
In step 140, the characterization model is trained based on the training sample set to obtain a trained characterization model.
In one embodiment, the plurality of facts in the knowledge graph obtained by expanding the hyper-relational knowledge graph to be embedded may be used to obtain the fact tuples corresponding to the facts and the training sample set corresponding to the fact tuples. Further, the characterization model can be trained based on the training sample set to obtain the trained characterization model. The trained characterization model can be understood as taking the hyper-relational knowledge graph to be embedded as input and producing the corresponding embedded hyper-relational knowledge graph, i.e., completing the hyper-relational knowledge graph to be embedded. In this embodiment, the training sample set is obtained directly from the fact information of the hyper-relational knowledge graph to be embedded and the characterization model is trained on it, which saves the cost of obtaining a training sample set and improves training efficiency.
It should be noted that the characterization model may be understood as a characterization model based on attribute expansion.
In another embodiment, the characterization model may also be trained with other training sample sets, that is, training sample sets not derived from the facts of the hyper-relational knowledge graph to be embedded, to obtain the trained characterization model.
In step 150, the hyper-relational knowledge graph to be embedded is input to the trained representation model, and the embedded hyper-relational knowledge graph output by the trained representation model is obtained.
In one embodiment, the hyper-relational knowledge graph to be embedded may be input to the trained characterization model to obtain the post-embedding hyper-relational knowledge graph. Through the embodiment, the input hyper-relational knowledge graph to be embedded is learned to obtain the representation vectors of the entities and the relations, so that the completion of the whole hyper-relational knowledge graph to be embedded is completed, and the expression capability and the prediction effect of the embedded hyper-relational knowledge graph are improved.
According to the hyper-relational knowledge graph embedding method, the hyper-relational knowledge graph to be embedded can be converted, through the attribute expansion operation, into a regular knowledge graph corresponding to it. Because embedding the hyper-relational knowledge graph on the basis of the regular knowledge graph can improve the quality of the embedded hyper-relational knowledge graph, in the invention a training sample set is obtained based on the fact tuples corresponding to the facts in the regular knowledge graph, a trained characterization model is obtained by training on the training sample set, and the hyper-relational knowledge graph to be embedded is input into the trained characterization model to obtain the embedded hyper-relational knowledge graph output by the trained characterization model. The invention thus improves the expressive power and prediction performance of the embedded hyper-relational knowledge graph.
The conversion of the hyper-relational knowledge graph to be embedded into a general knowledge graph can be realized by a Star-based Attribute-aware Expansion operation or a Clique-based Attribute-aware Expansion operation, which are described below.
FIG. 2 is one of the flow diagrams of the present invention for transforming a hyper-relational knowledge-graph to be embedded into a knowledge-graph corresponding to the hyper-relational knowledge-graph to be embedded based on the attribute unfolding operation.
In an exemplary embodiment of the present invention, the attribute expansion operation may include a star attribute expansion operation. The hyper-relational knowledge graph can comprise a plurality of hyper-relational facts, each hyper-relational fact can comprise a core triple and attribute-attribute value pairs, and the core triple can comprise a head entity, a tail entity and a relation. As can be seen from fig. 2, converting the hyper-relational knowledge graph to be embedded into the knowledge graph corresponding to it based on the attribute expansion operation may include steps 210 to 240, which will be described below.
In step 210, an intermediate entity and two intermediate relationships are introduced.
In one embodiment, as can be seen in conjunction with FIG. 3, for the k-th hyper-relational fact in the hyper-relational knowledge graph (corresponding to the hyper-relational knowledge graph to be embedded), it is assumed without loss of generality that it contains a core triple (s, r, o) and 2 attribute-attribute value pair descriptions (a_1, v_1) and (a_2, v_2).
In another embodiment, as can be seen from fig. 4, the hyper-relational fact shown in fig. 3 may be subjected to star attribute expansion to obtain the corresponding structure in the general knowledge graph. In one example, for the k-th hyper-relational fact ((s, r, o), {(a_1, v_1), (a_2, v_2)}), the star attribute expansion introduces an intermediate entity b_k and 2 relations (the corresponding intermediate relations) r_s and r_o, where the intermediate relations may be determined according to the relation between the introduced intermediate entity and the entities in the k-th hyper-relational fact.
In step 220, the intermediate entities are connected with the head entity and the tail entity respectively based on the intermediate relationships through the star attribute expansion operation.
In step 230, the intermediate entity is connected with the attribute value entity in the attribute-attribute value pair based on the attribute relationship in the attribute-attribute value pair.
In step 240, the head entity and the tail entity are connected based on the relationship in the core triple, and a knowledge graph corresponding to the hyper-relationship knowledge graph to be embedded is obtained together.
In one embodiment, continuing with the description of FIG. 4, the intermediate entity b_k may be connected with the head entity s and the tail entity o through r_s and r_o respectively, and the intermediate entity b_k may be connected with each attribute value entity v_i through the attribute relation a_i. In particular, to distinguish the semantics of the core triple (corresponding to (s, r, o)) from those of the attribute-attribute value description set, the head entity s and the tail entity o may be connected through the relation in the core triple. Thus, a hyper-relational fact k can be converted into a local subgraph structure on a general knowledge graph.
In yet another embodiment, converting the hyper-relational knowledge graph H = (E, R, F_H) (corresponding to the hyper-relational knowledge graph to be embedded) into the general knowledge graph G = (E, R, F) by the star attribute expansion operation can also be realized by the following algorithm:
(1) Input the hyper-relational knowledge graph H = (E, R, F_H), and initialize the converted general knowledge graph as G = (E, R, F) with F = ∅;
(2) Collect the relations in all the core triples of F_H, denoted as the relation set R_pri;
(3) For each relation r in the relation set R_pri, define two new relations r_s and r_o, and expand the relation set R ← R ∪ {r_s, r_o};
(4) For the k-th (k = 1, 2, …, |F_H|) hyper-relational fact ((s, r, o), {(a_i, v_i)}_{i=1}^{n_k}) in F_H, perform the following steps:
(a) Define an intermediate entity b_k and add it to the entity set E ← E ∪ {b_k};
(b) F ← F ∪ {(s, r, o), (b_k, r_s, s), (b_k, r_o, o)};
(c) F ← F ∪ {(b_k, a_i, v_i)}_{i=1}^{n_k};
(5) Output the converted general knowledge graph G = (E, R, F).
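The algorithm above can be illustrated with a short sketch; it assumes facts are stored as nested tuples as in the earlier snippet, and all function and variable names are illustrative rather than part of the patent.

```python
def star_expansion(hyper_facts):
    """Sketch of the star attribute expansion: every hyper-relational fact
    ((s, r, o), [(a_i, v_i), ...]) becomes a local star-shaped subgraph
    centred on an intermediate entity b_k."""
    entities, relations, triples = set(), set(), set()
    for (s, r, o), pairs in hyper_facts:
        entities.update([s, o] + [v for _, v in pairs])
        relations.update([r] + [a for a, _ in pairs])
    # step (3): add auxiliary relations r_s and r_o for every core-triple relation
    core_rels = {r for (_, r, _), _ in [((f[0][0], f[0][1], f[0][2]), f[1]) for f in hyper_facts]}
    relations.update({f"{r}_s" for r in core_rels} | {f"{r}_o" for r in core_rels})
    for k, ((s, r, o), pairs) in enumerate(hyper_facts):
        b_k = f"b_{k}"                      # intermediate entity for the k-th fact
        entities.add(b_k)
        triples.add((s, r, o))              # keep the core triple
        triples.add((b_k, f"{r}_s", s))     # connect b_k to the head entity
        triples.add((b_k, f"{r}_o", o))     # connect b_k to the tail entity
        for a, v in pairs:                  # connect b_k to every attribute value
            triples.add((b_k, a, v))
    return entities, relations, triples
```

Applied to `fact` from the earlier snippet, `star_expansion([fact])` returns the core triple, two auxiliary triples linking the intermediate entity b_0 to the head and tail entities, and one triple per attribute-attribute value pair, mirroring the structure of FIG. 4.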
Fig. 5 is a second schematic flow chart of converting the hyper-relational knowledge graph to be embedded into a knowledge graph corresponding to the hyper-relational knowledge graph to be embedded based on the attribute expansion operation according to the present invention.
In an exemplary embodiment of the present invention, the attribute expansion operation may include a clique attribute expansion operation. The hyper-relational knowledge graph can comprise a plurality of hyper-relational facts, each hyper-relational fact can comprise a core triple and attribute-attribute value pairs, and the core triple can comprise a head entity, a tail entity and a relation. As can be seen from fig. 5, converting the hyper-relational knowledge graph to be embedded into the knowledge graph corresponding to it based on the attribute expansion operation may include steps 510 to 540, which will be described below.
In step 510, four intermediate relationships are introduced, where the intermediate relationships may be determined according to the relationships in the core triplets, and the intermediate relationships include a first intermediate relationship, a second intermediate relationship, a third intermediate relationship, and a fourth intermediate relationship.
In an embodiment, as can be seen from fig. 6, the hyper-relational fact shown in fig. 3 may be subjected to clique attribute expansion to obtain the corresponding structure in the general knowledge graph. In one example, for the k-th hyper-relational fact ((s, r, o), {(a_1, v_1), (a_2, v_2)}), the clique attribute expansion introduces four intermediate relations. The intermediate relations can be determined according to the relations in the fact and include a first intermediate relation, a second intermediate relation, a third intermediate relation and a fourth intermediate relation. The first intermediate relation is used to describe the relation between the head entity s in the k-th hyper-relational fact and the attribute value entity v_1 of the attribute-attribute value pair; the second intermediate relation is used to describe the relation between the head entity s and the attribute value entity v_2; the third intermediate relation is used to describe the relation between the tail entity o and the attribute value entity v_1; and the fourth intermediate relation is used to describe the relation between the tail entity o and the attribute value entity v_2.
In step 520, the head entity and the attribute value entities in the attribute-attribute value pairs are connected based on the first intermediate relation and the second intermediate relation through the clique attribute expansion operation.
In step 530, the tail entity and the attribute value entity in the attribute-attribute value pair are connected based on the third intermediate relationship and the fourth intermediate relationship.
In step 540, the head entity and the tail entity are connected based on the relationship in the core triples to jointly obtain the knowledge graph corresponding to the hyper-relationship knowledge graph to be embedded.
In one embodiment, continuing with FIG. 6, the head entity s and the tail entity o may be connected through r; the head entity s may be connected with the attribute value entities v_1 and v_2 through the first intermediate relation and the second intermediate relation respectively; and the tail entity o may be connected with the attribute value entities v_1 and v_2 through the third intermediate relation and the fourth intermediate relation respectively. Thus, a hyper-relational fact k can be converted into a local subgraph structure on a general knowledge graph.
In yet another embodiment, converting the hyper-relational knowledge graph H = (E, R, F_H) (corresponding to the hyper-relational knowledge graph to be embedded) into the general knowledge graph G′ = (E′, R′, F′) by the clique attribute expansion operation can also be realized by the following algorithm:
(1) Input the hyper-relational knowledge graph H = (E, R, F_H), and initialize the converted general knowledge graph as G′ = (E′, R′, F′) with E′ = E, R′ = ∅ and F′ = ∅;
(2) Collect the relations in all the core triples of F_H, denoted as the relation set R_pri;
(3) Collect the relations in all the attribute-attribute value description sets of F_H, denoted as the relation set R_att;
(4) R′ ← R′ ∪ R_pri;
(5) For each relation a in the relation set R_att, define two new relations a_s and a_o, and expand the relation set R′ ← R′ ∪ {a_s, a_o};
(6) For the k-th (k = 1, 2, …, |F_H|) hyper-relational fact ((s, r, o), {(a_i, v_i)}_{i=1}^{n_k}) in F_H, perform the following step: F′ ← F′ ∪ {(s, r, o)} ∪ {(s, (a_i)_s, v_i), (o, (a_i)_o, v_i) | i = 1, …, n_k};
(7) Output the converted general knowledge graph G′ = (E′, R′, F′).
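A corresponding sketch of the clique attribute expansion, under the same assumptions as the star-expansion sketch (nested-tuple facts, illustrative names, and an a_s/a_o naming convention for the auxiliary relations), is:

```python
def clique_expansion(hyper_facts):
    """Sketch of the clique attribute expansion: no intermediate entity is
    introduced; for every attribute relation a, auxiliary relations a_s and
    a_o connect the attribute value directly to the head and tail entities."""
    entities, relations, triples = set(), set(), set()
    for (s, r, o), pairs in hyper_facts:
        entities.update([s, o] + [v for _, v in pairs])
        relations.add(r)                                # R' receives the core relations
        for a, _ in pairs:
            relations.update({f"{a}_s", f"{a}_o"})      # auxiliary relations per attribute
    for (s, r, o), pairs in hyper_facts:
        triples.add((s, r, o))                          # keep the core triple
        for a, v in pairs:
            triples.add((s, f"{a}_s", v))               # head entity -- attribute value
            triples.add((o, f"{a}_o", v))               # tail entity -- attribute value
    return entities, relations, triples
```

For the example fact above, this produces the core triple plus one head-side and one tail-side triple per attribute-attribute value pair, mirroring FIG. 6.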
In order to train the characterization model and effectively complete the hyper-relational knowledge graph to be embedded based on the trained characterization model, the characterization model can be trained with a training sample set. In an example, the training sample set may include a positive training sample set and a negative training sample set.
In one embodiment, obtaining the training sample set corresponding to the fact tuples based on the fact tuples corresponding to the facts may be implemented as follows: taking the fact tuple corresponding to a fact as the positive training sample set corresponding to that fact tuple; and replacing the head entity and/or tail entity in the fact tuple, based on the fact tuple corresponding to the fact, to obtain the negative training sample set corresponding to the fact tuple.
In one example, for each fact tuple x corresponding to a fact in the converted knowledge graph, a negative sample can be constructed by replacing the entity at the 1st position, giving the negative sample set N_1(x) constructed for that hyper-relational fact. By analogy, entities at different positions (e.g., the head entity and/or tail entity) can be replaced to construct the corresponding negative sample sets N_i(x). In this embodiment, the training sample set is obtained directly from the fact information of the hyper-relational knowledge graph to be embedded, and the characterization model is then trained based on the training sample set, which saves the cost of obtaining a training sample set and improves training efficiency.
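The negative-sample construction described above can be sketched as follows; the sampling strategy (uniform random replacement) and the parameter names are assumptions for illustration only.

```python
import random

def corrupt_position(fact, position, entity_pool, num_negatives=10):
    """Sketch of negative-sample construction: replace the entity at the given
    position of a fact tuple (0 = head, 1 = tail, 2.. = attribute values) with
    random entities drawn from entity_pool (a list of candidate entities)."""
    (s, r, o), pairs = fact
    slots = [s, o] + [v for _, v in pairs]
    negatives = []
    for _ in range(num_negatives):
        candidate = random.choice(entity_pool)
        while candidate == slots[position]:
            candidate = random.choice(entity_pool)
        corrupted = list(slots)
        corrupted[position] = candidate
        s_n, o_n, values_n = corrupted[0], corrupted[1], corrupted[2:]
        negatives.append(((s_n, r, o_n),
                          list(zip([a for a, _ in pairs], values_n))))
    return negatives
```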
To further introduce the method for embedding a hyper-relational knowledge graph provided by the present invention, a process of training a characterization model based on a training sample set to obtain a trained characterization model will be described below with reference to fig. 7.
In an exemplary embodiment of the invention, the set of training samples may include training fact tuples corresponding to the fact tuples, and the training fact tuples may include training entities and training relationships. Referring to fig. 7, training the characterization model based on the training sample set to obtain a trained characterization model may include steps 710 to 740, which are described below.
In step 710, the training sample set is input to the characterization model, and based on an encoder in the characterization model, the characterization of the training entities and the training relationships in the training sample set is updated, so as to obtain updated training entity characterization vectors and updated training relationship characterization vectors.
In one embodiment, the characterization model may be trained based on a training sample set obtained from a knowledge graph corresponding to the hyper-relational knowledge graph to be embedded. In an example, the transformed knowledge-graph may be used as input by obtaining tokens (which may correspond to updated training entity token vectors and updated training relationship token vectors) about entities and relationships on the knowledge-graph based on an encoder in the token model.
It should be noted that, in the application process, all entities and relations in the hyper-relational knowledge graph to be embedded may first be embedded as low-dimensional vectors, where each entity and relation may be characterized by a random initial vector. Further, the encoder model parameters may be initialized based on the dimension of the embedding vectors and pre-given hyper-parameters.
In one embodiment, for the knowledge graph obtained by the star attribute expansion, the characterization of the entities and relations on the knowledge graph (which may correspond to the updated training entity characterization vectors and the updated training relation characterization vectors) may be implemented as follows:
(1) Characterization initialization: for an intermediate entity b, let ψ(b) denote the mapping from the entity b to the core-triple relation of the hyper-relational fact from which b is derived. The characterization of b is initialized by combining a shared vector e_ψ(b) ∈ R^d with an entity-specific vector e_b ∈ R^d, where d is the characterization vector dimension, e_ψ(b) represents a characterization vector shared by the intermediate entities whose core triples have the same relation, and e_b is an intermediate-entity-specific characterization vector. An entity t that is not an intermediate entity is initialized with an independent characterization vector h_t^(0) ∈ R^d.
(2) Message computation: for a triple (u, r, t), the message for the target entity t is computed as MSG(h_u^(l), h_r^(l)), where h_u^(l) and h_r^(l) are the characterization vectors of the corresponding entity and relation at the l-th layer of the graph convolution network, and MSG represents a message computation function, for example an element-wise composition of the entity and relation vectors.
(3) Message aggregation: the messages in the neighborhood of the target entity t are aggregated additively (SUM), as shown in formula (1):
m_t^(l) = Σ_{r∈R} Σ_{u∈N_r(t)} MSG(h_u^(l), h_r^(l)),    (1)
where N_r(t) denotes the set of neighborhood entities of entity t with respect to relation r.
(4) Characterization update: combining the entity characterization at the previous layer h_t^(l), the entity characterization is updated by a nonlinear update function UPD (in the invention, a fully connected layer) to obtain the (l+1)-th layer characterization, as shown in formula (2):
h_t^(l+1) = UPD(h_t^(l), m_t^(l)).    (2)
In particular, the relation characterization at each layer can be updated by a linear mapping, as shown in formula (3):
h_r^(l+1) = W^(l) h_r^(l).    (3)
based on the above, the embodiment realizes the representation updating of the entity and the relation on the input knowledge graph (corresponding to the updated training entity representation vector and the updated training relation representation vector), and further lays a foundation for the training of the representation model.
It should be noted that, for the clique attribute expansion, the invention does not introduce intermediate entities, so the characterization initialization in step (1) randomly initializes all entity and relation characterization vectors.
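One layer of the message computation, aggregation and update steps described above can be sketched as follows; the element-wise product message function, the concatenation-based fully connected update and the weight shapes are assumptions, and NumPy is used purely for illustration.

```python
import numpy as np

def gcn_layer(h_ent, h_rel, triples, ent_index, rel_index, W_upd, W_rel):
    """Sketch of one multi-relational graph-convolution layer.
    h_ent: (|E|, d) entity characterizations, h_rel: (|R|, d) relation
    characterizations, W_upd: (2d, d) update weights, W_rel: (d, d)."""
    messages = np.zeros_like(h_ent)
    for (u, r, t) in triples:
        ui, ri, ti = ent_index[u], rel_index[r], ent_index[t]
        # message computation: element-wise composition of source entity and relation,
        # accumulated with additive (SUM) aggregation over the neighborhood of t
        messages[ti] += h_ent[ui] * h_rel[ri]
    # characterization update: previous entity characterization combined with the
    # aggregated messages through a fully connected layer and a non-linearity
    h_ent_next = np.tanh(np.concatenate([h_ent, messages], axis=1) @ W_upd)
    h_rel_next = h_rel @ W_rel          # linear mapping for the relation characterizations
    return h_ent_next, h_rel_next
```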
In step 720, the updated training entity characterization vector and the updated training relationship characterization vector are input to a decoder to obtain a fact score corresponding to the training fact tuple, where the decoder is a decoder based on a scoring function to be embedded into the hyper-relational knowledge graph.
In one embodiment, the determination of the decoder model parameters may also be performed in accordance with the determination of the encoder model parameters. In the application process, after the encoder obtains the characterization vectors of the entities and the relations contained in the hyper-relational knowledge graph to be embedded, in this embodiment, the decoder may be designed based on the scoring function of the existing hyper-relational knowledge graph (for example, the hyper-relational knowledge graph to be embedded), so as to measure the possibility that the entities and the relations constitute the hyper-relational fact.
In an example, for a hyper-relational fact x = ((s, r, o), {(a_i, v_i)}_{i=1}^{n}) and an encoder consisting of an L-layer graph convolution network, the entity and relation characterization vectors input to the decoder are denoted h_s^(L), h_r^(L), h_o^(L), h_{a_i}^(L) and h_{v_i}^(L). The decoder scoring function based on the existing scoring function n-DistMult can be expressed as formula (4):
φ(x) = ⟨ g(h_r^(L), h_{a_1}^(L), …, h_{a_n}^(L)), h_s^(L), h_o^(L), h_{v_1}^(L), …, h_{v_n}^(L) ⟩,    (4)
where ⟨·⟩ represents the multi-linear product operation, satisfying ⟨h_1, h_2, …, h_n⟩ = Σ_i h_1[i]·h_2[i]…h_n[i], and the function g(·) represents a pooling operation on the relation and attribute vectors, such as bitwise summation/averaging.
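A sketch of this decoder scoring, assuming mean pooling over the relation and attribute vectors and a dictionary h mapping entity/relation names to their L-th-layer vectors, is:

```python
import numpy as np

def score_fact(h, fact):
    """Sketch of an n-DistMult-style decoder: pool the relation and attribute
    vectors, then take the multi-linear product with the head, tail and
    attribute-value vectors (mean pooling is one of the options mentioned)."""
    (s, r, o), pairs = fact
    rel_vecs = [h[r]] + [h[a] for a, _ in pairs]
    pooled = np.mean(rel_vecs, axis=0)                   # pooling over relation + attributes
    ent_vecs = [h[s], h[o]] + [h[v] for _, v in pairs]
    prod = pooled.copy()
    for vec in ent_vecs:                                 # multi-linear product <pooled, h_s, h_o, h_v1, ...>
        prod = prod * vec
    return float(prod.sum())
```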
It should be noted that the hyper-relational knowledge graph embedding scheme based on attribute expansion proposed by this embodiment is not limited to the above described decoder structure of the scoring function, and any scoring function of the hyper-relational knowledge graph may be used as the decoder part of this embodiment.
In step 730, based on the fact score, a loss function is derived.
In one embodiment, the training fact tuples may include positive training fact tuples corresponding to the positive training sample set and negative training fact tuples corresponding to the negative training sample set, and the fact scores may include positive training fact scores corresponding to the positive training fact tuples and negative training fact scores corresponding to the negative training fact tuples. Based on the fact scores, the loss function may be determined using formula (5):
L = Σ_x Σ_i −log [ exp(φ(x)) / ( exp(φ(x)) + Σ_{x′∈N_i(x)} exp(φ(x′)) ) ],    (5)
where L represents the loss function; φ(x) represents the positive training fact score; φ(x′) represents the negative training fact score; N_i(x) represents the negative training sample set of the fact tuple x constructed for the i-th position; and x′ represents a negative training fact tuple.
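The loss computation can be sketched as follows; note that the softmax cross-entropy form is an assumption consistent with the symbols φ(x), φ(x′) and N_i(x) defined above, not a verbatim reproduction of the patent's formula.

```python
import numpy as np

def loss_for_fact(pos_score, neg_scores_by_position):
    """Sketch of the negative-sampling loss for one positive fact, assuming a
    softmax cross-entropy over each corrupted position i (an assumption)."""
    total = 0.0
    for neg_scores in neg_scores_by_position:            # one list of phi(x') per position i
        logits = np.array([pos_score] + list(neg_scores))
        log_norm = np.log(np.exp(logits).sum())
        # -log( exp(phi(x)) / (exp(phi(x)) + sum_x' exp(phi(x'))) )
        total += -(logits[0] - log_norm)
    return total
```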
In step 740, the characterization model is updated based on the loss function to obtain a trained characterization model.
In one embodiment, the parameter update of the characterization model may be considered complete when a convergence condition is reached. A trained characterization model is then obtained from the updated parameters, and the embedded hyper-relational knowledge graph corresponding to the hyper-relational knowledge graph to be embedded is obtained based on the trained characterization model, thereby completing the hyper-relational knowledge graph. The convergence condition may be that the performance of the model on the validation set does not improve for a number of consecutive iterations, or that the maximum number of iterations is reached. In one example, a threshold may be set for the trained model: the hyper-relational knowledge graph embedding method of the invention calculates a score for a given hyper-relational fact, and if the score is higher than the threshold, the given fact is considered to be true; the score is positively correlated with the possibility that the fact is true.
It should be noted that the hyper-relational knowledge graph completion problem can be reduced to a link prediction problem, expressed in the following form: given an incomplete hyper-relational knowledge graph to be embedded H, with entity set E and relation set R, the known hyper-relational facts are expressed as the set F_H = {((s_i, r_i, o_i), {(a_ij, v_ij)}_j)}_i, where i denotes the i-th fact, s_i ∈ E, o_i ∈ E, v_ij ∈ E, r_i ∈ R and a_ij ∈ R. The link prediction task needs to infer the missing facts in the hyper-relational knowledge graph to be embedded from the observed facts, such as predicting the object (tail) entity of the missing core triple in ((s, r, ?), {(a_1, v_1), (a_2, v_2)}).
To further describe the hyper-relational knowledge graph embedding method provided by the present invention, the following description will be made with reference to fig. 8.
FIG. 8 is a second flowchart of the method for embedding a hyper-relational knowledge-graph according to the present invention.
In an exemplary embodiment of the present invention, as can be seen in fig. 8, the hyper-relational knowledge graph embedding method may include steps 810 to 860, which will be described separately below.
In step 810, a hyper-relational knowledge graph to be embedded and initialization parameters are input.
In one embodiment, the WikiPeople dataset is taken as an example of the hyper-relational knowledge graph to be embedded. The WikiPeople dataset comprises 47,765 entities, 707 relations, attribute-attribute value description sets with 2 to 9 elements, 305,725 training set entries, 38,223 validation set entries and 38,281 test set entries. The process of learning the hyper-relational knowledge graph embedding and completing link prediction according to the invention is described below taking WikiPeople as an example.
In an example, the input hyper-relational knowledge graph to be embedded may be initialized, that is, one low-dimensional vector with a length of d is randomly generated for 47765 entities in the data set, and one low-dimensional vector with a length of d is randomly generated for 707 relations in the data set. Further, the model parameters of the encoder and decoder may be initialized according to the relevant parameters of the initial input.
In step 820, the hyper-relational knowledge graph to be embedded is converted into a knowledge graph corresponding to the hyper-relational knowledge graph to be embedded based on the attribute unfolding operation.
In one embodiment, the hyper-relational knowledge graph to be embedded H = (E, R, F_H) may be converted into the general knowledge graph G = (E, R, F) through the attribute expansion operation. The star attribute expansion operation is taken as an example below.
(1) Initialize the converted general knowledge graph G = (E, R, F) with F = ∅;
(2) Collect the relations in all the core triples of F_H, denoted as the relation set R_pri;
(3) For each relation r in the relation set R_pri, define two new relations r_s and r_o, and expand the relation set R ← R ∪ {r_s, r_o};
(4) For the k-th (k = 1, 2, …, |F_H|) hyper-relational fact ((s, r, o), {(a_i, v_i)}_{i=1}^{n_k}) in F_H, perform the following steps:
define an intermediate entity b_k and add it to the entity set E ← E ∪ {b_k};
F ← F ∪ {(s, r, o), (b_k, r_s, s), (b_k, r_o, o)};
F ← F ∪ {(b_k, a_i, v_i)}_{i=1}^{n_k};
(5) Output the converted knowledge graph G = (E, R, F).
In step 830, message propagation and characterization updates are performed on the knowledge graph based on the encoder.
It should be noted that the encoder may be an encoder based on a multi-relation graph convolutional network.
In one embodiment, the 305,725 records of the training set F_H may be equally divided into a plurality of mini-batches, with the data in each mini-batch denoted F_b, and one mini-batch of data is taken in order for training. For each hyper-relational fact x in F_b (corresponding to a fact tuple in the converted knowledge graph), a negative sample can be constructed, and replacing the entity at the 1st position gives the corresponding negative sample set N_1(x) (which may correspond to a negative training sample set). By analogy, the entities at different positions (the head entity and/or tail entity) are replaced to construct the corresponding negative sample sets.
In yet another example, the converted knowledge graph G = (E, R, F) may be input into the encoder structure based on the graph convolution network to obtain the characterization matrices of the entities and relations, i.e., the updated entity and relation characterization vectors, that is, (H_E, H_R) = Enc(G), where Enc represents the encoder function and any graph convolution network structure may be employed.
In step 840, the updated entity and relationship characterization vectors are input to a decoder based on the scoring function of the hyper-relationship knowledge-graph to be embedded, and the score of each hyper-relationship fact is calculated.
In step 850, a loss function is derived based on the scores of the superrelational facts, and a characterization model is trained based on the loss function.
In one embodiment, can be according to
Figure BDA0003857737050000209
Sample of (1)
Figure BDA00038577370500002010
From
Figure BDA00038577370500002011
Selecting corresponding entity and relation representation vector, inputting into decoder structure, calculating score of corresponding super relation fact
Figure BDA00038577370500002012
Then, calculating a loss function as shown in formula (6), and updating parameters of the eigenvector and the model parameter by adopting a gradient descent algorithm:
L = -Σ_i log[ exp(φ(x)) / ( exp(φ(x)) + Σ_{x'∈N_i(x)} exp(φ(x')) ) ]  (6)
wherein N_i(x) denotes the negative sample set constructed for the i-th position of the fact tuple x.
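To make the scoring and loss computation concrete, the sketch below assumes, purely for illustration, a DistMult-style scoring function as the decoder and a softmax-style negative-sampling loss over each positive fact and its negative sample sets; the embodiment itself only requires a scoring function of the hyper-relational knowledge graph to be embedded and a gradient-descent update of the parameters.

```python
import numpy as np

def score(fact, entity_emb, relation_emb):
    """phi(x): an assumed DistMult-style score for a triple (h, r, t)."""
    h, r, t = fact
    return float(np.sum(entity_emb[h] * relation_emb[r] * entity_emb[t]))

def fact_loss(x, negative_sets, entity_emb, relation_emb):
    """Loss of one positive fact tuple x against its negative sample sets N_i(x)."""
    loss = 0.0
    pos = np.exp(score(x, entity_emb, relation_emb))
    for neg_set in negative_sets:  # one set per corrupted position i
        neg = sum(np.exp(score(x_neg, entity_emb, relation_emb)) for x_neg in neg_set)
        loss -= np.log(pos / (pos + neg))
    # the gradient of this loss with respect to the embeddings and model
    # weights would then be applied by any gradient-descent optimizer
    return loss
```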
In the application process, steps 820 to 850 may be repeated to complete one iteration over the whole training set, after which a link prediction task is performed on the validation set and the Mean Reciprocal Rank (MRR) is calculated; the larger the MRR, the higher the rank of the correct missing entity and the better the link prediction accuracy. In particular, when calculating the MRR index, positive samples already observed in the data, other than the test fact itself, may be filtered out, which yields the filtered MRR. Training is stopped if the MRR index on the validation set does not rise for a given number of consecutive iterations, or if the total number of iterations reaches a given limit; otherwise, the previous steps continue to be repeated.
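The filtered MRR check used for validation and early stopping could be computed along the lines of the following sketch; ranking only tail-entity candidates and the helper names are simplifying assumptions made for the example.

```python
def filtered_mrr(test_triples, all_entities, known_facts, score_fn):
    """Mean reciprocal rank with observed positives filtered out of the ranking.

    score_fn(h, r, t) returns the model's score for a candidate triple.
    """
    reciprocal_ranks = []
    for h, r, t in test_triples:
        target = score_fn(h, r, t)
        rank = 1
        for e in all_entities:
            if e == t or (h, r, e) in known_facts:  # filter out other known positives
                continue
            if score_fn(h, r, e) > target:
                rank += 1
        reciprocal_ranks.append(1.0 / rank)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)
```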
In step 860, the hyper-relational knowledge graph to be embedded is input to the trained characterization model, and the hyper-relational knowledge graph to be embedded is completed.
In one embodiment, the hyper-relational knowledge graph to be embedded may be complemented based on the trained characterization model. During application, the link prediction task may be completed on the test set. The comparison of the link prediction results of the hyper-relational knowledge graph to be embedded can be shown in table 1.
TABLE 1 comparison table of link prediction results of hyper-relational knowledge graph to be embedded
The data in Table 1 characterize the link prediction performance index, where a larger value indicates better prediction performance. From Table 1, it can be found that the hyper-relational knowledge graph embedding method based on attribute unfolding provided by the invention is obviously improved compared with the baseline methods.
As described above, the hyper-relational knowledge graph embedding method provided by the invention converts the hyper-relational knowledge graph to be embedded into a conventional knowledge graph corresponding to the hyper-relational knowledge graph to be embedded through the attribute unfolding operation. The invention obtains a training sample set from the fact tuples corresponding to the facts in the converted conventional knowledge graph, trains the characterization model based on the training sample set to obtain a trained characterization model, inputs the hyper-relational knowledge graph to be embedded into the trained characterization model, and obtains the embedded hyper-relational knowledge graph output by the trained characterization model. The invention improves the expression capability and the prediction effect of the embedded hyper-relational knowledge graph.
Based on the same conception, the invention also provides a superrelation knowledge graph embedding device.
The superrelation knowledge graph embedding device provided by the invention is described below, and the superrelation knowledge graph embedding device described below and the superrelation knowledge graph embedding method described above can be referred to correspondingly.
FIG. 9 is a schematic structural diagram of a hyper-relational knowledge-graph embedding apparatus provided by the present invention.
In an exemplary embodiment of the present invention, as can be seen in fig. 9, the hyper-relational knowledge graph embedding apparatus may include an obtaining module 910, a transforming module 920, a processing module 930, a training module 940, and a generating module 950, which will be described below.
The obtaining module 910 may be configured to obtain a hyper-relational knowledge graph to be embedded.
The conversion module 920 may be configured to perform an operation based on the attribute expansion to convert the hyper-relational knowledge graph to be embedded into a knowledge graph corresponding to the hyper-relational knowledge graph to be embedded, where the knowledge graph may include a plurality of facts.
The processing module 930 may be configured to derive a set of training samples corresponding to a fact tuple based on the fact tuple corresponding to the fact.
The training module 940 may be configured to train the characterization model based on the training sample set, resulting in a trained characterization model.
The generating module 950 may be configured to input the hyper-relational knowledge graph to be embedded into the trained characterization model, resulting in an embedded hyper-relational knowledge graph output by the trained characterization model.
In an exemplary embodiment of the invention, the attribute expansion operation may include a star attribute expansion operation, the super-relation knowledge graph may include a plurality of super-relation facts, the super-relation facts may include core triples and attribute-attribute value pairs, and the core triples may include head entities, tail entities and relations. The conversion module 920 may convert the hyper-relational knowledge graph to be embedded into a knowledge graph corresponding to the hyper-relational knowledge graph to be embedded based on the attribute unfolding operation in the following manner: introducing an intermediate entity and two intermediate relations; respectively connecting the intermediate entity with the head entity and the tail entity based on the intermediate relation through star-shaped attribute unfolding operation; and connecting the intermediate entity with the attribute value entity in the attribute-attribute value pair based on the attribute relationship in the attribute-attribute value pair, and connecting the head entity and the tail entity based on the relationship in the core triple to jointly obtain the knowledge graph corresponding to the hyper-relationship knowledge graph to be embedded.
In an exemplary embodiment of the invention, the attribute expansion operation may include a blob attribute expansion operation, the hyper-relational knowledge graph may include a plurality of hyper-relational facts, the hyper-relational facts may include core triples and attribute-attribute value pairs, and the core triples may include head entities, tail entities and relations. The conversion module 920 may convert the hyper-relational knowledge graph to be embedded into a knowledge graph corresponding to the hyper-relational knowledge graph to be embedded based on the attribute expansion operation in the following manner: introducing four intermediate relationships, wherein the intermediate relationships are determined according to the relations in the core triples, and the intermediate relationships comprise a first intermediate relationship, a second intermediate relationship, a third intermediate relationship and a fourth intermediate relationship; connecting the head entity and the attribute value entity in the attribute-attribute value pair based on the first intermediate relationship and the second intermediate relationship through a blob attribute unfolding operation; and connecting the tail entity with the attribute value entities in the attribute-attribute value pairs based on the third intermediate relationship and the fourth intermediate relationship, and connecting the head entity with the tail entity based on the relation in the core triple, to jointly obtain the knowledge graph corresponding to the hyper-relational knowledge graph to be embedded.
In an exemplary embodiment of the present invention, the training sample set may include a positive training sample set and a negative training sample set; the processing module 930 may derive a training sample set corresponding to a fact tuple based on the fact tuple corresponding to the fact in the following manner: taking a fact tuple corresponding to the fact as a positive training sample set corresponding to the fact tuple; and replacing a head entity and/or a tail entity in the fact multi-tuple based on the fact multi-tuple corresponding to the fact to obtain a negative training sample set corresponding to the fact multi-tuple.
In an exemplary embodiment of the invention, the training sample set may include training fact tuples corresponding to the fact tuples, and the training fact tuples may include training entities and training relationships; the training module 940 may train the characterization model based on the training sample set in the following manner to obtain a trained characterization model: inputting the training sample set into a characterization model, and performing characterization updating on training entities and training relations in the training sample set based on an encoder in the characterization model to obtain updated training entity characterization vectors and updated training relation characterization vectors; inputting the updated training entity characterization vector and the updated training relation characterization vector into a decoder to obtain a fact score corresponding to the training fact tuple, wherein the decoder is based on a scoring function of the hyper-relation knowledge graph to be embedded; based on the fact score, obtaining a loss function; and updating parameters of the characterization model based on the loss function to obtain the trained characterization model.
In an exemplary embodiment of the invention, the training fact tuple may include a positive training fact tuple corresponding to the positive training sample set and a negative training fact tuple corresponding to the negative training sample set, and the fact score may include a positive training fact score corresponding to the positive training fact tuple and a negative training fact score corresponding to the negative training fact tuple; the training module 940 may derive the loss function based on the fact score using equation (7):
L = -Σ_i log[ exp(φ(x)) / ( exp(φ(x)) + Σ_{x'∈N_i(x)} exp(φ(x')) ) ]  (7)
wherein L represents the loss function; φ(x) represents the positive training fact score; φ(x') represents the negative training fact score; N_i(x) represents the negative training sample set of the fact tuple x constructed for the i-th position; and x' represents a negative training fact tuple.
Fig. 10 illustrates a physical structure diagram of an electronic device, and as shown in fig. 10, the electronic device may include: a processor (processor) 1010, a communication Interface (Communications Interface) 1020, a memory (memory) 1030, and a communication bus 1040, wherein the processor 1010, the communication Interface 1020, and the memory 1030 are in communication with each other via the communication bus 1040. Processor 1010 may invoke logic instructions in memory 1030 to perform a hyper-relational knowledge-graph embedding method comprising: acquiring a hyper-relation knowledge graph to be embedded; based on attribute expansion operation, converting the hyper-relationship knowledge graph to be embedded into a knowledge graph corresponding to the hyper-relationship knowledge graph to be embedded, wherein the knowledge graph comprises a plurality of facts; obtaining a training sample set corresponding to the fact multi-element group based on the fact multi-element group corresponding to the fact; training the characterization model based on the training sample set to obtain a trained characterization model; and inputting the hyper-relation knowledge graph to be embedded into the trained representation model to obtain the embedded hyper-relation knowledge graph output by the trained representation model.
Furthermore, the logic instructions in the memory 1030 can be implemented in software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, and various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program, the computer program being storable on a non-transitory computer-readable storage medium, wherein when the computer program is executed by a processor, the computer is capable of executing the hyper-relational knowledge graph embedding method provided by the above methods, the method comprising: acquiring a hyper-relation knowledge graph to be embedded; based on attribute expansion operation, converting the hyper-relation knowledge graph to be embedded into a knowledge graph corresponding to the hyper-relation knowledge graph to be embedded, wherein the knowledge graph comprises a plurality of facts; obtaining a training sample set corresponding to the fact multi-element group based on the fact multi-element group corresponding to the fact; training the characterization model based on the training sample set to obtain a trained characterization model; and inputting the hyper-relation knowledge graph to be embedded into the trained representation model to obtain the embedded hyper-relation knowledge graph output by the trained representation model.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor, implements a hyper-relational knowledge graph embedding method provided by the above methods, the method comprising: acquiring a hyper-relation knowledge graph to be embedded; based on attribute expansion operation, converting the hyper-relationship knowledge graph to be embedded into a knowledge graph corresponding to the hyper-relationship knowledge graph to be embedded, wherein the knowledge graph comprises a plurality of facts; obtaining a training sample set corresponding to the fact multi-element group based on the fact multi-element group corresponding to the fact; training the characterization model based on the training sample set to obtain a trained characterization model; and inputting the hyper-relation knowledge graph to be embedded into the trained representation model to obtain the embedded hyper-relation knowledge graph output by the trained representation model.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. Based on the understanding, the above technical solutions substantially or otherwise contributing to the prior art may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the various embodiments or some parts of the embodiments.
It is further to be understood that while operations are depicted in the drawings in a particular order, this is not to be understood as requiring that such operations be performed in the particular order shown or in serial order, or that all illustrated operations be performed, to achieve desirable results. In certain environments, multitasking and parallel processing may be advantageous.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A hyper-relational knowledge graph embedding method, the method comprising:
acquiring a hyper-relation knowledge graph to be embedded;
based on attribute expansion operation, converting the to-be-embedded hyper-relational knowledge graph into a knowledge graph corresponding to the to-be-embedded hyper-relational knowledge graph, wherein the knowledge graph comprises a plurality of facts;
obtaining a training sample set corresponding to the fact multi-element group based on the fact multi-element group corresponding to the fact;
training a characterization model based on the training sample set to obtain a trained characterization model;
and inputting the hyper-relational knowledge graph to be embedded into the trained representation model to obtain the embedded hyper-relational knowledge graph output by the trained representation model.
2. The hyper-relational knowledge graph embedding method according to claim 1, wherein the attribute unfolding operations comprise star attribute unfolding operations, the hyper-relational knowledge graph comprises a plurality of hyper-relational facts, the hyper-relational facts comprise core triples and attribute-attribute value pairs, and the core triples comprise head entities, tail entities and relations;
the operation of expanding based on attributes converts the hyper-relational knowledge graph to be embedded into a knowledge graph corresponding to the hyper-relational knowledge graph to be embedded, and specifically comprises the following steps:
introducing an intermediate entity and two intermediate relationships;
connecting the intermediate entity with the head entity and the tail entity respectively based on the intermediate relationship through the star attribute unfolding operation;
connecting the intermediate entity with an attribute value entity of the attribute-attribute value pair based on the attribute relationship in the attribute-attribute value pair, and
and connecting the head entity and the tail entity based on the relationship in the core triple to jointly obtain a knowledge graph corresponding to the hyper-relationship knowledge graph to be embedded.
3. The method of claim 1, wherein the attribute unfolding operations comprise blob attribute unfolding operations, the hyper-relational knowledge graph comprises a plurality of hyper-relational facts, the hyper-relational facts comprise core triples and attribute-attribute value pairs, and the core triples comprise head entities, tail entities, and relationships;
the expanding operation based on the attributes converts the hyper-relational knowledge graph to be embedded into a knowledge graph corresponding to the hyper-relational knowledge graph to be embedded, and specifically comprises the following steps:
introducing four intermediate relationships, wherein the intermediate relationships are determined according to the relationships in the core triplet, and the intermediate relationships include a first intermediate relationship, a second intermediate relationship, a third intermediate relationship, and a fourth intermediate relationship;
connecting, by the blob attribute unrolling operation, the head entity and an attribute value entity of the attribute-attribute value pair based on the first intermediate relationship and the second intermediate relationship;
connecting the tail entity with an attribute value entity of the attribute-attribute value pair based on the third intermediate relationship and the fourth intermediate relationship, an
And connecting the head entity and the tail entity based on the relationship in the core triple to jointly obtain a knowledge graph corresponding to the hyper-relationship knowledge graph to be embedded.
4. The method of claim 1, wherein the set of training samples comprises a set of positive training samples and a set of negative training samples;
the obtaining a training sample set corresponding to the fact tuple based on the fact tuple corresponding to the fact specifically includes:
determining a fact tuple corresponding to the fact as the set of positive training samples corresponding to the fact tuple;
and replacing a head entity and/or a tail entity in the fact multi-tuple based on the fact multi-tuple corresponding to the fact to obtain the negative training sample set corresponding to the fact multi-tuple.
5. The hyper-relational knowledge graph embedding method according to claim 1, wherein the training sample set comprises training fact tuples corresponding to the fact tuples, and the training fact tuples comprise training entities and training relations;
training the characterization model based on the training sample set to obtain a trained characterization model, specifically comprising:
inputting the training sample set into the characterization model, and performing characterization updating on the training entities and the training relations in the training sample set based on an encoder in the characterization model to obtain updated training entity characterization vectors and updated training relation characterization vectors;
inputting the updated training entity characterization vector and the updated training relationship characterization vector to a decoder to obtain a fact score corresponding to the training fact tuple, wherein the decoder is based on the scoring function of the hyper-relational knowledge graph to be embedded;
obtaining a loss function based on the fact score;
and updating parameters of the characterization model based on the loss function to obtain the trained characterization model.
6. The hyper-relational knowledge graph embedding method according to claim 5, wherein the training fact tuples comprise positive training fact tuples corresponding to the positive training sample set and negative training fact tuples corresponding to the negative training sample set, and the fact score comprises a positive training fact score corresponding to the positive training fact tuples and a negative training fact score corresponding to the negative training fact tuples;
the obtaining a loss function based on the fact score specifically comprises: determining the loss function using the following formula:
L = -Σ_i log[ exp(φ(x)) / ( exp(φ(x)) + Σ_{x'∈N_i(x)} exp(φ(x')) ) ]
wherein L represents the loss function; φ(x) represents the positive training fact score; φ(x') represents the negative training fact score; N_i(x) represents the negative training sample set of the fact tuple x constructed for the i-th position; and x' represents the negative training fact tuple.
7. A hyper-relational knowledge graph embedding apparatus, the apparatus comprising:
the acquisition module is used for acquiring the hyper-relation knowledge graph to be embedded;
the conversion module is used for converting the hyper-relation knowledge graph to be embedded into a knowledge graph corresponding to the hyper-relation knowledge graph to be embedded based on attribute expansion operation, wherein the knowledge graph comprises a plurality of facts;
the processing module is used for obtaining a training sample set corresponding to the fact multi-element group based on the fact multi-element group corresponding to the fact;
the training module is used for training the characterization model based on the training sample set to obtain a trained characterization model;
and the generation module is used for inputting the hyper-relation knowledge graph to be embedded into the trained representation model to obtain the post-embedding hyper-relation knowledge graph output by the trained representation model.
8. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the hyper-relational knowledge-graph embedding method according to any one of claims 1 to 6 when executing the program.
9. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the hyper-relational knowledge graph embedding method according to any one of claims 1 to 6.
10. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the superrelational knowledge graph embedding method according to any one of claims 1 to 6.
CN202211154145.8A 2022-09-21 2022-09-21 Super-relationship knowledge graph embedding method and device, electronic equipment and storage medium Active CN115757806B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211154145.8A CN115757806B (en) 2022-09-21 2022-09-21 Super-relationship knowledge graph embedding method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211154145.8A CN115757806B (en) 2022-09-21 2022-09-21 Super-relationship knowledge graph embedding method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115757806A true CN115757806A (en) 2023-03-07
CN115757806B CN115757806B (en) 2024-05-28

Family

ID=85351774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211154145.8A Active CN115757806B (en) 2022-09-21 2022-09-21 Super-relationship knowledge graph embedding method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115757806B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112182245A (en) * 2020-09-28 2021-01-05 中国科学院计算技术研究所 Knowledge graph embedded model training method and system and electronic equipment
CN113742488A (en) * 2021-07-30 2021-12-03 清华大学 Embedded knowledge graph completion method and device based on multitask learning
CN113836312A (en) * 2021-09-13 2021-12-24 中国人民解放军32801部队 Knowledge representation reasoning method based on encoder and decoder framework
CN114357177A (en) * 2021-12-08 2022-04-15 中国长城科技集团股份有限公司 Knowledge hypergraph generation method and device, terminal device and storage medium
CN114780879A (en) * 2022-03-30 2022-07-22 天津大学 Interpretable link prediction method for knowledge hypergraph
CN114942997A (en) * 2022-04-21 2022-08-26 阿里巴巴(中国)有限公司 Data processing method, model training method, risk identification method, equipment and storage medium
CN114817568A (en) * 2022-04-29 2022-07-29 武汉科技大学 Knowledge hypergraph link prediction method combining attention mechanism and convolutional neural network
CN114860889A (en) * 2022-05-31 2022-08-05 北京科技大学 Steel potential knowledge reasoning method and system based on steel knowledge graph

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
PAOLO ROSSO et al.: "Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction", WWW '20: Proceedings of The Web Conference 2020, 20 April 2020 (2020-04-20), pages 1885-1896 *
杨东华 et al.: "Research Progress on Graph Embedding Learning for Knowledge Graphs", Journal of Software (软件学报), vol. 33, no. 9, 8 September 2022 (2022-09-08), pages 3370-3390 *
耿化聪: "Research on a Diet Recommendation Algorithm Based on Knowledge Graph and Collaborative Filtering", China Master's Theses Full-text Database, Engineering Science and Technology I (中国优秀硕士学位论文全文数据库 工程科技Ⅰ辑), no. 2, 15 February 2022 (2022-02-15), pages 025-8 *
蔡淑琴; 肖泉; 吴颖敏: "Research on Hypergraph-Based Knowledge Representation and Retrieval Similarity Measurement", Library and Information Service (图书情报工作), no. 8, 20 April 2009 (2009-04-20), pages 103-106 *
陆枫: "Construction and Application of a Personnel Relationship Knowledge Graph Based on Neo4j", Software Engineering (软件工程), vol. 25, no. 9, 5 September 2022 (2022-09-05), pages 5-8 *

Also Published As

Publication number Publication date
CN115757806B (en) 2024-05-28


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant