CN115757806B

CN115757806B - Super-relationship knowledge graph embedding method and device, electronic equipment and storage medium

Info

Publication number: CN115757806B
Application number: CN202211154145.8A
Authority: CN
Inventors: 刘宇; 李勇; 金德鹏
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2022-09-21
Filing date: 2022-09-21
Publication date: 2024-05-28
Anticipated expiration: 2042-09-21
Also published as: CN115757806A

Abstract

The invention provides a super-relationship knowledge graph embedding method and device, an electronic device, and a storage medium. The method comprises: acquiring a to-be-embedded super-relationship knowledge graph; converting the to-be-embedded super-relationship knowledge graph into a corresponding knowledge graph based on an attribute expansion operation, wherein the knowledge graph comprises a plurality of facts; obtaining a training sample set based on the fact tuples corresponding to the facts; training a characterization model based on the training sample set to obtain a trained characterization model; and inputting the to-be-embedded super-relationship knowledge graph into the trained characterization model to obtain the embedded super-relationship knowledge graph output by the trained characterization model. The invention improves the expression capability and the prediction effect of the embedded super-relationship knowledge graph.

Description

Super-relationship knowledge graph embedding method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of knowledge graph technologies, and in particular, to a method and apparatus for embedding a super-relationship knowledge graph, an electronic device, and a storage medium.

Background

A knowledge graph structurally represents and stores real-world facts in the form of a graph: the things and concepts contained in the facts correspond to entities in the knowledge graph, and the relationships among the entities correspond to edges in the knowledge graph. Recent research has proposed attaching additional attribute-attribute value pairs to triples to enrich their semantic information, which yields the hyper-relational (super-relationship) knowledge graph.

In the related art, super-relationship knowledge graph embedding is the main approach to super-relationship knowledge graph completion: entities (including attribute values) and relationships (including attributes) in the super-relationship knowledge graph are projected into a continuous low-dimensional vector space, and the resulting vectors are called embedding vectors or characterization vectors. However, current super-relationship knowledge graph embedding methods suffer from limited expression capability and poor prediction performance.

Disclosure of Invention

The invention provides a super-relationship knowledge graph embedding method, a device, electronic equipment and a storage medium, which are used for solving the defects of limited expression capacity and low prediction effect of the super-relationship knowledge graph embedding method in the prior art and improving the expression capacity and the prediction effect of the embedded super-relationship knowledge graph.

The invention provides a super-relationship knowledge graph embedding method, which comprises the following steps: acquiring a to-be-embedded super-relationship knowledge graph; converting the to-be-embedded super-relationship knowledge graph into a corresponding knowledge graph based on an attribute expansion operation, wherein the knowledge graph comprises a plurality of facts; obtaining a training sample set based on the fact tuples corresponding to the facts; training a characterization model based on the training sample set to obtain a trained characterization model; and inputting the to-be-embedded super-relationship knowledge graph into the trained characterization model to obtain the embedded super-relationship knowledge graph output by the trained characterization model.

According to the super-relationship knowledge graph embedding method provided by the invention, the attribute expansion operation comprises a star attribute expansion operation, the super-relationship knowledge graph comprises a plurality of super-relationship facts, each super-relationship fact comprises a core triplet and attribute-attribute value pairs, and the core triplet comprises a head entity, a tail entity and a relationship. Converting the to-be-embedded super-relationship knowledge graph into the corresponding knowledge graph based on the attribute expansion operation specifically comprises: introducing an intermediate entity and two intermediate relationships; connecting, through the star attribute expansion operation, the intermediate entity with the head entity and with the tail entity based on the intermediate relationships; connecting the intermediate entity with the attribute value entity in each attribute-attribute value pair based on the attribute relationship in that pair; and connecting the head entity with the tail entity based on the relationship in the core triplet, thereby obtaining the knowledge graph corresponding to the to-be-embedded super-relationship knowledge graph.

According to the super-relationship knowledge graph embedding method provided by the invention, the attribute expansion operation comprises a clique attribute expansion operation, the super-relationship knowledge graph comprises a plurality of super-relationship facts, each super-relationship fact comprises a core triplet and attribute-attribute value pairs, and the core triplet comprises a head entity, a tail entity and a relationship. Converting the to-be-embedded super-relationship knowledge graph into the corresponding knowledge graph based on the attribute expansion operation specifically comprises: introducing four intermediate relationships, which are determined according to the relationship in the core triplet and comprise a first, a second, a third and a fourth intermediate relationship; connecting, through the clique attribute expansion operation, the head entity with the attribute value entity in each attribute-attribute value pair based on the first intermediate relationship and the second intermediate relationship; connecting the tail entity with the attribute value entity in each attribute-attribute value pair based on the third intermediate relationship and the fourth intermediate relationship; and connecting the head entity with the tail entity based on the relationship in the core triplet, thereby obtaining the knowledge graph corresponding to the to-be-embedded super-relationship knowledge graph.

According to the super-relationship knowledge graph embedding method provided by the invention, the training sample set comprises a positive training sample set and a negative training sample set. Obtaining the training sample set based on the fact tuples corresponding to the facts specifically comprises: taking the fact tuples corresponding to the facts as the positive training sample set; and replacing the head entity and/or the tail entity in each fact tuple to obtain the negative training sample set corresponding to that fact tuple.
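The negative-sample construction described above can be sketched as follows. This is a minimal illustration, not the patented procedure: the function name `negative_samples`, the tuple layout `((s, r, o), pairs)`, the 50/50 choice between corrupting the head or the tail, and the filtering of candidates that already appear as known facts are all assumptions made for the example. The sketch replaces either the head or the tail in each negative, which covers only part of the "and/or" wording.

```python
import random
from typing import List, Sequence, Set, Tuple

Pair = Tuple[str, str]
# Hypothetical layout of a fact tuple: core triplet plus attribute-value pairs.
FactTuple = Tuple[Tuple[str, str, str], Tuple[Pair, ...]]


def negative_samples(
    fact: FactTuple,
    entities: Sequence[str],
    known: Set[FactTuple],
    num: int,
    rng: random.Random,
) -> List[FactTuple]:
    """Corrupt the head or the tail entity of `fact` to build negatives."""
    (s, r, o), pairs = fact
    negatives: List[FactTuple] = []
    # Bounded number of attempts so the loop ends even for tiny entity sets.
    for _ in range(100 * num):
        if len(negatives) == num:
            break
        e = rng.choice(entities)
        if rng.random() < 0.5:
            candidate = ((e, r, o), pairs)  # replace the head entity
        else:
            candidate = ((s, r, e), pairs)  # replace the tail entity
        # Assumption: skip known (positive) facts and duplicates.
        if candidate not in known and candidate not in negatives:
            negatives.append(candidate)
    return negatives


positive: FactTuple = (("s", "r", "o"), (("a1", "v1"),))
entities = ["s", "o", "e1", "e2", "e3"]
negs = negative_samples(positive, entities, {positive}, 4, random.Random(0))
```

Each element of `negs` keeps the relationship and the attribute-attribute value pairs of the positive tuple and differs from it in exactly one of the head or tail positions.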

According to the super-relationship knowledge graph embedding method provided by the invention, the training sample set comprises training fact tuples corresponding to the fact tuples, and the training fact tuples comprise training entities and training relationships. Training the characterization model based on the training sample set to obtain the trained characterization model specifically comprises: inputting the training sample set into the characterization model, and updating the characterizations of the training entities and training relationships in the training sample set with the encoder in the characterization model to obtain updated training entity characterization vectors and updated training relationship characterization vectors; inputting the updated training entity characterization vectors and the updated training relationship characterization vectors into a decoder to obtain a fact score for each training fact tuple, wherein the decoder is based on a scoring function for the to-be-embedded super-relationship knowledge graph; obtaining a loss function based on the fact scores; and updating the parameters of the characterization model based on the loss function to obtain the trained characterization model.

According to the super-relationship knowledge graph embedding method provided by the invention, the training fact tuples comprise positive training fact tuples corresponding to the positive training sample set and negative training fact tuples corresponding to the negative training sample set, and the fact scores comprise a positive training fact score for each positive training fact tuple and a negative training fact score for each negative training fact tuple. The loss function obtained based on the fact scores is determined using the following formula:

Wherein the formula (not reproduced in this text) is defined over the following quantities: the loss function; φ(x), the positive training fact score; φ(x′), the negative training fact score; the negative training sample set constructed for the i-th position of the fact tuple x; and x′, a negative training fact tuple.

The invention also provides a super-relationship knowledge graph embedding device, which comprises: an acquisition module for acquiring a to-be-embedded super-relationship knowledge graph; a conversion module for converting the to-be-embedded super-relationship knowledge graph into a corresponding knowledge graph based on an attribute expansion operation, wherein the knowledge graph comprises a plurality of facts; a processing module for obtaining a training sample set based on the fact tuples corresponding to the facts; a training module for training a characterization model based on the training sample set to obtain a trained characterization model; and a generation module for inputting the to-be-embedded super-relationship knowledge graph into the trained characterization model to obtain the embedded super-relationship knowledge graph output by the trained characterization model.

The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the super-relationship knowledge graph embedding method according to any one of the above when executing the program.

The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a super-relationship knowledge-graph embedding method as described in any one of the above.

The invention also provides a computer program product comprising a computer program which when executed by a processor implements a method of embedding a hyper-relational knowledge graph as described in any one of the above.

According to the super-relationship knowledge graph embedding method and device, the electronic device, and the storage medium provided by the invention, the to-be-embedded super-relationship knowledge graph is converted, through the attribute expansion operation, into the corresponding conventional knowledge graph. Because embedding the super-relationship knowledge graph on the basis of a conventional knowledge graph can improve the quality of the embedded super-relationship knowledge graph, a training sample set is obtained from the fact tuples corresponding to the facts in the conventional knowledge graph, the characterization model is trained on that training sample set, and the to-be-embedded super-relationship knowledge graph is input into the trained characterization model to obtain the embedded super-relationship knowledge graph it outputs. The invention thereby improves the expression capability and the prediction effect of the embedded super-relationship knowledge graph.

Drawings

In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a super-relationship knowledge graph embedding method provided by the invention;

FIG. 2 is a first schematic flow chart of converting the to-be-embedded super-relationship knowledge graph into the corresponding knowledge graph based on the attribute expansion operation provided by the invention;

FIG. 3 is a schematic diagram of the portion of the super-relationship knowledge graph related to the k-th super-relationship fact provided by the invention;

FIG. 4 is a schematic diagram of the knowledge graph obtained by performing the star attribute expansion operation on the k-th super-relationship fact in the super-relationship knowledge graph;

FIG. 5 is a second schematic flow chart of converting the to-be-embedded super-relationship knowledge graph into the corresponding knowledge graph based on the attribute expansion operation provided by the invention;

FIG. 6 is a schematic diagram of the knowledge graph obtained by performing the clique attribute expansion operation on the k-th super-relationship fact in the super-relationship knowledge graph;

FIG. 7 is a schematic flow chart of training the characterization model based on the training sample set to obtain the trained characterization model;

FIG. 8 is a second schematic flow chart of the super-relationship knowledge graph embedding method provided by the invention;

FIG. 9 is a schematic structural diagram of the super-relationship knowledge graph embedding device provided by the invention;

FIG. 10 is a schematic structural diagram of the electronic device provided by the invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

A knowledge graph (Knowledge Graph) structurally represents and stores real-world facts in the form of a graph, wherein the things and concepts contained in the facts correspond to entities in the knowledge graph, and the relationships among the entities correspond to edges in the knowledge graph. A knowledge graph mainly describes facts as triples (s, r, o), where s and o represent the head entity and the tail entity respectively, and r represents the relationship between the two entities.

For a super-relationship knowledge graph (Hyper-relational Knowledge Graph), a super-relationship fact can be expressed as $((s, r, o), \{(a_i, v_i)\}_{i=1}^{n})$, where $(s, r, o)$ is the core triplet and $\{(a_i, v_i)\}_{i=1}^{n}$ is the attribute-attribute value description set, which contains $n$ attribute-attribute value pairs.
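As an illustration of this representation, a super-relationship fact can be held in a small immutable record. The sketch below is only one possible layout; the class name `HyperRelationalFact`, its field names, and the sample values are hypothetical and chosen for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HyperRelationalFact:
    """A core triplet (s, r, o) plus n attribute-attribute value pairs."""

    head: str       # s
    relation: str   # r
    tail: str       # o
    pairs: Tuple[Tuple[str, str], ...] = ()  # ((a_1, v_1), ..., (a_n, v_n))

    @property
    def core_triplet(self) -> Tuple[str, str, str]:
        return (self.head, self.relation, self.tail)

    @property
    def n(self) -> int:
        """Number of attribute-attribute value pairs."""
        return len(self.pairs)


# Made-up example values, loosely following the learning-experience scenario.
fact = HyperRelationalFact(
    head="user_1",
    relation="educated_at",
    tail="institution_1",
    pairs=(("major", "major_A"), ("degree", "bachelor")),
)
```

Keeping the pairs in a tuple makes the record hashable, so facts can be stored in sets when building positive and negative sample sets.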

Since a super-relationship knowledge graph is incomplete and a great number of facts are missing, the missing entities/attribute values and relationships/attributes must be inferred from the existing super-relationship knowledge graph information (hereinafter, entities include s, o, v, etc., and relationships include r, a, etc.); that is, the super-relationship knowledge graph must be completed.

The super-relationship knowledge graph embedding method provided by the invention can be used to complete a super-relationship knowledge graph. The method is based on attribute expansion: entities and relationships are embedded into a low-dimensional continuous vector space, and the to-be-embedded super-relationship knowledge graph is expanded into a general knowledge graph through an attribute expansion operation; an encoder based on a multi-relational graph convolutional network then propagates and updates the embedding characterizations on the expanded knowledge graph. Further, the entity and relationship characterization vectors output by the encoder are input into a decoder based on the scoring function of an existing super-relationship knowledge graph method, which computes the likelihood that a plurality of entities and the corresponding relationships form a super-relationship fact. By learning the information contained in the input to-be-embedded super-relationship knowledge graph, the method obtains the characterization vectors of the entities and relationships and thereby completes the whole super-relationship knowledge graph.

The embedded super-relationship knowledge graph output by the characterization model can be applied in various scenarios; for example, it can be used for completing the entities in the to-be-embedded super-relationship knowledge graph, for recommendation, and for search-based question answering.

In one embodiment, in a search question-answering scenario, the super-relationship knowledge graph embedding method can be used to supplement the missing relationships in a question to be answered (corresponding to the to-be-embedded super-relationship knowledge graph), thereby completing the question (corresponding to the embedded super-relationship knowledge graph). Intelligent search can then be performed based on the completed question, so that the answers obtained are more accurate.

In an example, a to-be-embedded super-relationship knowledge graph may be constructed from the question to be answered. The to-be-embedded super-relationship knowledge graph is then converted into the corresponding knowledge graph through the attribute expansion operation. The knowledge graph comprises a plurality of facts, which may correspond to keywords in the question to be answered.

Further, a training sample set is obtained based on the fact tuples corresponding to the facts, and the characterization model is trained based on the training sample set to obtain a trained characterization model. In the application process, the to-be-embedded super-relationship knowledge graph constructed from the question to be answered can be input into the trained characterization model to obtain the embedded super-relationship knowledge graph output by the trained characterization model.

It should be noted that the embedded super-relationship knowledge graph is obtained by supplementing the missing relationships based on the facts and relationships already present in the to-be-embedded super-relationship knowledge graph. Accurate semantic information about the question to be answered can be obtained from the embedded super-relationship knowledge graph, so that a search based on this semantic information returns more accurate answers.

In yet another embodiment, the entities in the to-be-embedded super-relationship knowledge graph may be a user and the user's learning experience. In the application process, a to-be-embedded super-relationship knowledge graph about the user can be constructed from the user's learning experience information. The to-be-embedded super-relationship knowledge graph is then converted into the corresponding knowledge graph through the attribute expansion operation. The knowledge graph comprises a plurality of facts, which may correspond to key information in the user's learning experience, such as an academic institution and a research institution.

Further, a training sample set is obtained based on the fact tuples corresponding to the facts, and the characterization model is trained based on the training sample set to obtain a trained characterization model. In the application process, the to-be-embedded super-relationship knowledge graph constructed from the user's learning experience can be input into the trained characterization model to obtain the embedded super-relationship knowledge graph output by the trained characterization model.

It should be noted that the embedded super-relationship knowledge graph is obtained by supplementing the missing relationships based on the facts and relationships already present in the to-be-embedded super-relationship knowledge graph. From the embedded super-relationship knowledge graph, it can then be determined in what way the user is connected to each learning experience (for example, to the academic institution through undergraduate study in major A, or to the research institution through graduate study in major A), so that the user's learning experience is completed, laying a foundation for obtaining comprehensive information about the user.

In order to further describe the super-relationship knowledge graph embedding method provided by the invention, the following description will be made with reference to fig. 1.

FIG. 1 is a schematic flow chart of the super-relationship knowledge graph embedding method provided by the invention.

In an exemplary embodiment of the present invention, as can be seen in fig. 1, the super-relationship knowledge graph embedding method may include steps 110 to 150, and each step will be described below.

In step 110, a to-be-embedded super-relationship knowledge graph is acquired.

In an embodiment, the to-be-embedded super-relationship knowledge graph may be a super-relationship knowledge graph in which entities/attribute values or relationships/attributes are missing; that is, the to-be-embedded super-relationship knowledge graph is incomplete.

In step 120, the to-be-embedded super-relationship knowledge graph is converted into the corresponding knowledge graph based on the attribute expansion operation, wherein the knowledge graph comprises a plurality of facts.

In one embodiment, the to-be-embedded super-relationship knowledge graph can be converted by the attribute expansion operation into a conventional knowledge graph. Because the super-relationship knowledge graph is then embedded on the basis of a conventional knowledge graph, well-established and effective embedding schemes for conventional knowledge graphs can be applied, which helps ensure the quality of the embedded super-relationship knowledge graph and improves its expression capability and prediction effect.

In step 130, a training sample set corresponding to the fact tuples is obtained based on the fact tuples corresponding to the facts.

In step 140, the characterization model is trained based on the training sample set, resulting in a trained characterization model.

In one embodiment, the facts in the knowledge graph obtained by expanding the to-be-embedded super-relationship knowledge graph are used to obtain a plurality of fact tuples and the training sample set corresponding to those fact tuples. The characterization model is then trained on the training sample set to obtain the trained characterization model. The trained characterization model takes the to-be-embedded super-relationship knowledge graph as input and outputs the corresponding embedded super-relationship knowledge graph, thereby completing the to-be-embedded super-relationship knowledge graph. In this embodiment, the training sample set is obtained directly from the fact information of the to-be-embedded super-relationship knowledge graph, which saves the cost of acquiring a separate training sample set and improves training efficiency.

It should be noted that the characterization model may be understood as a characterization model based on attribute expansion.

In another embodiment, the characterization model may also be trained with other training sample sets, i.e., training sample sets that are not derived from the facts of the to-be-embedded super-relationship knowledge graph, to obtain the trained characterization model.

In step 150, the to-be-embedded super-relationship knowledge graph is input into the trained characterization model, and the embedded super-relationship knowledge graph output by the trained characterization model is obtained.

In one embodiment, the to-be-embedded super-relationship knowledge graph may be input into the trained characterization model to obtain the embedded super-relationship knowledge graph. In this embodiment, the characterization vectors of the entities and relationships are obtained by learning the information contained in the input to-be-embedded super-relationship knowledge graph, the whole to-be-embedded super-relationship knowledge graph is thereby completed, and the expression capability and prediction effect of the embedded super-relationship knowledge graph are improved.

According to the super-relationship knowledge graph embedding method provided by the invention, the to-be-embedded super-relationship knowledge graph is converted, through the attribute expansion operation, into the corresponding conventional knowledge graph. Because embedding the super-relationship knowledge graph on the basis of a conventional knowledge graph can improve the quality of the embedded super-relationship knowledge graph, a training sample set is obtained from the fact tuples corresponding to the facts in the conventional knowledge graph, the characterization model is trained on that training sample set, and the to-be-embedded super-relationship knowledge graph is input into the trained characterization model to obtain the embedded super-relationship knowledge graph it outputs. The invention thereby improves the expression capability and the prediction effect of the embedded super-relationship knowledge graph.

The conversion of the to-be-embedded super-relationship knowledge graph into a general knowledge graph can be realized by a star attribute expansion (Star-based Attribute-aware Expansion) operation or by a clique attribute expansion (Clique-based Attribute-aware Expansion) operation, which are described in turn below.

FIG. 2 is a first schematic flow chart of converting the to-be-embedded super-relationship knowledge graph into the corresponding knowledge graph based on the attribute expansion operation.

In an exemplary embodiment of the present invention, the attribute expansion operation may include a star attribute expansion operation. The super-relationship knowledge graph may comprise a plurality of super-relationship facts, each super-relationship fact may comprise a core triplet and attribute-attribute value pairs, and the core triplet may comprise a head entity, a tail entity and a relationship. As can be seen from FIG. 2, converting the to-be-embedded super-relationship knowledge graph into the corresponding knowledge graph based on the attribute expansion operation may include steps 210 to 240, each of which is described below.

In step 210, an intermediate entity and two intermediate relationships are introduced.

In one embodiment, as can be seen in conjunction with FIG. 3, for the k-th super-relationship fact in the super-relationship knowledge graph (corresponding to the to-be-embedded super-relationship knowledge graph), it is assumed without loss of generality that the fact contains a core triplet $(s, r, o)$ and two attribute-attribute value pairs $(a_1, v_1)$ and $(a_2, v_2)$.

In yet another embodiment, as can be seen in conjunction with FIG. 4, the star attribute expansion operation may be applied to the super-relationship fact shown in FIG. 3 to obtain its structure in the general knowledge graph. In one example, for the k-th super-relationship fact $((s, r, o), \{(a_1, v_1), (a_2, v_2)\})$, the star attribute expansion introduces one intermediate entity $b_k$ and two relationships $r_s$ and $r_o$ (the intermediate relationships). The intermediate relationships may be determined from the relationship between the introduced intermediate entity and the entities in the k-th super-relationship fact.

In step 220, the intermediate entities are connected with the head entity and the tail entity, respectively, based on the intermediate relationship, by a star attribute expansion operation.

In step 230, the intermediate entity is connected with the attribute-value entity in the attribute-value pair based on the attribute relationship in the attribute-value pair.

In step 240, the head entity and the tail entity are connected based on the relationship in the core triplet, so as to jointly obtain a knowledge graph corresponding to the knowledge graph to be embedded with the super-relationship.

In one embodiment, continuing with the description of FIG. 4, the intermediate entity $b_k$ may be connected to the head entity $s$ and the tail entity $o$ via $r_s$ and $r_o$, respectively, and the intermediate entity $b_k$ may be connected to each attribute value entity $v_i$ via the attribute relationship $a_i$. In particular, to distinguish the semantics of the core triplet $(s, r, o)$ from those of the attribute-attribute value description set, the head entity $s$ and the tail entity $o$ may additionally be connected by the relationship in the core triplet. Thus, the k-th super-relationship fact is converted into a local subgraph structure on a general knowledge graph.

In yet another embodiment, the process of converting the super-relationship knowledge graph (corresponding to the super-relationship knowledge graph to be embedded) into a general knowledge graph by the star attribute expansion operation may also be implemented by the following algorithm:

(1) Input the super-relationship knowledge graph, whose set of super-relationship facts is F^H; the converted general knowledge graph can be expressed as (E, R, F), where E is the entity set, R is the relation set, and F is the fact set;

(2) Collect the relations in all core triplets in F^H and record them as the relation set R^pri;

(3) For each relation r in the relation set R^pri, define new relations r_s and r_o, and expand the relation set: R ← R ∪ {r_s, r_o};

(4) For each super-relationship fact k = 1, 2, …, |F^H| in F^H, of the form ((s, r, o), {(a_i, v_i)}), perform the following steps:

(a) Define an intermediate entity b_k and add it to the entity set: E ← E ∪ {b_k};

(b) F ← F ∪ {(s, r, o), (b_k, r_s, s), (b_k, r_o, o)};

(c) For each attribute-attribute value pair (a_i, v_i) of the fact, F ← F ∪ {(b_k, a_i, v_i)};

(5) Output the converted general knowledge graph (E, R, F).
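The star attribute expansion can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `star_expand`, the tuple-based names for the intermediate entity and the intermediate relations, and the toy fact in the usage note are all hypothetical, and step (c) is assumed to add one triple (b_k, a_i, v_i) per attribute-attribute value pair, following the description of step 230.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Triple = Tuple[Hashable, Hashable, Hashable]
# A super-relationship fact: ((s, r, o), [(a_1, v_1), (a_2, v_2), ...])
HyperFact = Tuple[Triple, List[Tuple[Hashable, Hashable]]]


def star_expand(hyper_facts: Iterable[HyperFact]):
    """Convert super-relationship facts into plain triples by star expansion."""
    entities: Set[Hashable] = set()
    relations: Set[Hashable] = set()
    triples: Set[Triple] = set()

    for k, ((s, r, o), pairs) in enumerate(hyper_facts, start=1):
        # Hypothetical naming scheme for b_k, r_s and r_o.
        b_k = ("b", k)
        r_s, r_o = (r, "s"), (r, "o")

        entities.update({s, o, b_k})
        relations.update({r, r_s, r_o})
        # Core triple, plus the links from b_k to the head and the tail.
        triples.update({(s, r, o), (b_k, r_s, s), (b_k, r_o, o)})
        # Assumed step (c): link b_k to each attribute value entity.
        for a, v in pairs:
            entities.add(v)
            relations.add(a)
            triples.add((b_k, a, v))

    return entities, relations, triples
```

For a single made-up fact such as `(("s", "r", "o"), [("a1", "v1"), ("a2", "v2")])`, the call `star_expand([...])` yields five triples: the core triple, two links from the intermediate entity to the head and tail, and two links to the attribute values.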

Fig. 5 is a second schematic flow chart of converting a knowledge graph of a super-relationship to be embedded into a knowledge graph corresponding to the knowledge graph of the super-relationship to be embedded based on an attribute unfolding operation.

In an exemplary embodiment of the present invention, the attribute expansion operation may include a blob attribute expansion operation. Wherein, the super relation knowledge graph can comprise a plurality of super relation facts, the super relation facts can comprise a core triplet and attribute-attribute value pairs, and the core triplet can comprise a head entity, a tail entity and a relation. As can be seen from fig. 5, the conversion of the knowledge-graph of the super-relationship to be embedded into the knowledge-graph corresponding to the knowledge-graph of the super-relationship to be embedded may include steps 510 to 540 based on the attribute development operation, and each step will be described below.

In step 510, four intermediate relationships are introduced, where the intermediate relationships may be determined from the relationships in the core triples, the intermediate relationships including a first intermediate relationship, a second intermediate relationship, a third intermediate relationship, and a fourth intermediate relationship.

In one embodiment, as can be seen from fig. 6, the super-relationship knowledge graph shown in fig. 3 may be subjected to blob attribute expansion to obtain the corresponding structure in the general knowledge graph. In one example, for the kth super-relationship fact ((s, r, o), {(a₁, v₁), (a₂, v₂)}), the blob attribute expansion introduces four intermediate relations, which may be determined from the relationships in the core triplets: a first, a second, a third, and a fourth intermediate relation. The first intermediate relation describes the relationship between the head entity s and the attribute value entity v₁ of the attribute-attribute value pairs in the kth super-relationship fact; the second intermediate relation describes the relationship between the head entity s and the attribute value entity v₂; the third intermediate relation describes the relationship between the tail entity o and the attribute value entity v₁; and the fourth intermediate relation describes the relationship between the tail entity o and the attribute value entity v₂.

In step 520, the head entity and the attribute value entities in the attribute-attribute value pairs are connected based on the first intermediate relationship and the second intermediate relationship by the blob attribute expansion operation.

In step 530, the tail entity and the attribute value entity in the attribute-attribute value pair are connected based on the third intermediate relationship and the fourth intermediate relationship.

In step 540, the head entity and the tail entity are connected based on the relationship in the core triplet, and a knowledge graph corresponding to the super-relationship knowledge graph to be embedded is obtained together.

In one embodiment, continuing with the description of FIG. 6, the head entity s and the tail entity o may be connected by r; the head entity s may be connected to the attribute value entities v₁ and v₂ through the first intermediate relation and the second intermediate relation, respectively; and the tail entity o may be connected to the attribute value entities v₁ and v₂ through the third intermediate relation and the fourth intermediate relation, respectively. Thus, a super-relationship fact k can be converted into a local subgraph structure on a general knowledge graph.

In yet another embodiment, the process of converting the super-relationship knowledge graph (corresponding to the super-relationship knowledge graph to be embedded) into a general knowledge graph by the blob attribute expansion operation may also be implemented by the following algorithm:

(3) Collect the relations in all attribute-attribute value description sets in F^H and record them as the relation set R^att;

(4) R′ ← R′ ∪ R^pri;

(5) For each relation a in the relation set R^att, define two new relations and expand the relation set R′ with them;

(6) For each super-relationship fact k = 1, 2, …, |F^H| in F^H, add the triples described in steps 520 to 540: the head entity and the tail entity connected by the relation of the core triplet, and the head entity and the tail entity each connected to every attribute value entity by the corresponding new relation;

(7) Output the converted general knowledge graph.
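A minimal Python sketch of the blob attribute expansion is given below, following steps 520 to 540. It is an illustration under assumptions rather than the patented implementation: the function name `blob_expand` and the tuple-based names `(a, "s")` and `(a, "o")` for the two new relations derived from each attribute relation a are hypothetical.

```python
def blob_expand(hyper_facts):
    """Convert super-relationship facts into plain triples by blob expansion.

    Each fact has the form ((s, r, o), [(a_1, v_1), (a_2, v_2), ...]).
    No intermediate entity is introduced.
    """
    entities, relations, triples = set(), set(), set()

    for (s, r, o), pairs in hyper_facts:
        entities.update({s, o})
        relations.add(r)
        # Step 540: head and tail stay connected by the core relation.
        triples.add((s, r, o))
        for a, v in pairs:
            # Hypothetical names for the head-side and tail-side relations.
            a_s, a_o = (a, "s"), (a, "o")
            entities.add(v)
            relations.update({a_s, a_o})
            # Steps 520 and 530: connect head and tail to the attribute value.
            triples.update({(s, a_s, v), (o, a_o, v)})

    return entities, relations, triples
```

With two attribute-attribute value pairs, a fact produces four intermediate relations and five triples, and the entity set contains only the entities already present in the fact.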

In order to train the characterization model and effectively complete the super-relationship knowledge graph to be embedded based on the trained characterization model, the characterization model can be trained on a training sample set. In an example, the training sample set may include a positive training sample set and a negative training sample set.

In one embodiment, the training sample set corresponding to the fact tuples may be derived from the fact tuples corresponding to the facts in the following manner: the fact tuples corresponding to the facts are taken as the positive training sample set; and the head entities and/or the tail entities in the fact tuples are replaced to obtain the negative training sample set corresponding to the fact tuples.

In one example, for each fact tuple corresponding to a fact in the converted knowledge graph (each piece of super-relationship fact data), negative samples may be constructed: replacing the entity at position 1 yields the corresponding negative sample set. Similarly, entities at other positions (e.g., the head and/or tail entities) may be replaced to construct the corresponding negative sample sets. In this embodiment, the training sample set is obtained directly from the fact information of the super-relationship knowledge graph to be embedded, and the characterization model is trained on that training sample set, which saves the cost of obtaining a training sample set and improves training efficiency.
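The construction of a negative sample set for one position can be sketched as follows. This is a hedged illustration: the function name `corrupt_position`, the flat-tuple representation of a fact tuple, and the choice to skip corrupted tuples that happen to be known facts are assumptions, not details stated in the text.

```python
import random


def corrupt_position(fact, position, entities, known_facts, num_samples, rng):
    """Build negatives by replacing the entity at `position` in `fact`.

    fact: a tuple such as (s, r, o); position: index of an entity slot.
    known_facts: set of observed tuples, used to skip accidental positives
    (an assumed refinement).
    """
    candidates = [e for e in entities if e != fact[position]]
    rng.shuffle(candidates)
    negatives = set()
    for e in candidates:
        if len(negatives) == num_samples:
            break
        corrupted = fact[:position] + (e,) + fact[position + 1:]
        if corrupted not in known_facts:
            negatives.add(corrupted)
    return negatives
```

Calling the function once per entity position (for example, the head position and the tail position) gives one negative sample set per position; a seeded `random.Random` instance can be passed as `rng` for reproducibility.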

In order to further describe the super-relationship knowledge graph embedding method provided by the invention, a process of training the characterization model based on the training sample set and obtaining the trained characterization model will be described with reference to fig. 7.

In an exemplary embodiment of the present invention, the training sample set may include a training fact tuple corresponding to the fact tuple, and the training fact tuple may include a training entity and a training relationship. As can be seen in conjunction with fig. 7, training the characterization model based on the training sample set to obtain the trained characterization model may include steps 710 to 740, which will be described below.

In step 710, the training sample set is input to the characterization model, and based on the encoder in the characterization model, the training entities and training relationships in the training sample set are characterized and updated to obtain updated training entity characterization vectors and updated training relationship characterization vectors.

In one embodiment, the characterization model may be trained on a training sample set derived from the knowledge graph corresponding to the super-relationship knowledge graph to be embedded. In an example, the converted knowledge graph may be used as input, and the characterizations of the entities and relations on the knowledge graph (which may correspond to the updated training entity characterization vectors and the updated training relationship characterization vectors) may be obtained by the encoder of the characterization model.

It should be noted that, in the application process, all entities and relations in the super-relationship knowledge graph to be embedded can be embedded in a low-dimensional vector space, where each entity and relation can be represented by a random initial vector. Further, the encoder model parameters may be initialized based on the dimension of the embedding vectors and the pre-specified hyperparameters.

In one embodiment, for a knowledge graph obtained by star attribute expansion, the characterizations of the entities and relations on the knowledge graph (which may correspond to the updated training entity characterization vectors and the updated training relationship characterization vectors) may be obtained as follows:

(1) Characterization initialization: for an intermediate entity b, let ψ denote the mapping from b to the relation of the core triplet of the super-relationship fact that b belongs to. The characterization of b is initialized from e_ψ(b) and e_b, where d is the characterization vector dimension, e_ψ(b) is the characterization vector shared by the intermediate entities of the same core-triplet relation, and e_b is the characterization vector specific to the intermediate entity itself. An entity t that is not an intermediate entity is initialized to an independent characterization vector.

(2) Message calculation: for a triplet (u, r, t), the message for the target entity t is calculated by a message calculation function MSG from the characterization vectors of the corresponding entity and relation at the current graph convolution layer.

(3) Message aggregation: the messages in the neighborhood of the target entity t are aggregated by summation (SUM), as shown in formula (1):

Wherein the neighborhood is the set of neighbor entities of entity t with respect to relation r.

(4) Characterization update: the characterization of the entity at the previous layer is combined with the aggregated message, and the entity characterization is updated by a nonlinear update function UPD (a fully connected layer in the present invention) to obtain the characterization at the next layer, as shown in formula (2):

In particular, the relation characterization at each layer may be updated by a linear mapping, as shown in formula (3):

Based on the above, this embodiment updates the characterizations of the entities and relations on the input knowledge graph (corresponding to the updated training entity characterization vectors and the updated training relationship characterization vectors), which lays a foundation for training the characterization model.
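One message-passing layer of the kind described in steps (2) to (4) can be sketched in plain Python. The concrete choices here are assumptions, since formulas (1) to (3) are not reproduced in the text: MSG is taken to be the element-wise product of the source entity and relation vectors, the previous-layer characterization and the aggregated message are combined by addition, UPD is a single fully connected layer followed by tanh, and the names `matvec` and `message_passing_layer` are hypothetical.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def message_passing_layer(triples, ent, rel, W_ent, W_rel):
    """Apply one layer to entity and relation vectors stored in dicts."""
    dim = len(W_ent)
    agg = {t: [0.0] * dim for t in ent}
    for u, r, t in triples:
        # MSG (assumed): element-wise product of source entity and relation.
        msg = [hu * hr for hu, hr in zip(ent[u], rel[r])]
        # SUM aggregation over the neighborhood of the target entity t.
        agg[t] = [a + m for a, m in zip(agg[t], msg)]

    new_ent = {}
    for t, h in ent.items():
        # UPD (assumed form): fully connected layer with a tanh nonlinearity.
        combined = [hi + ai for hi, ai in zip(h, agg[t])]
        new_ent[t] = [math.tanh(z) for z in matvec(W_ent, combined)]

    # Relation characterizations are updated by a linear mapping.
    new_rel = {r: matvec(W_rel, h) for r, h in rel.items()}
    return new_ent, new_rel
```

Stacking several such layers, each with its own weight matrices, gives a multi-layer encoder; a practical implementation would normally use a tensor library and batched operations instead of Python lists.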

It should be noted that the blob attribute expansion does not introduce intermediate entities, so in that case the characterization initialization in step (1) randomly initializes the characterization vectors of all entities and relations.

In step 720, the updated training entity characterization vector and the updated training relationship characterization vector are input to a decoder to obtain a fact score corresponding to the training fact tuple, where the decoder is based on a scoring function of the super-relationship knowledge graph to be embedded.

In one embodiment, the determination of the decoder model parameters may also be implemented in accordance with the encoder model parameter determination process. In the application process, after the encoder obtains the characterization vector of the entity and the relationship contained in the super-relationship knowledge graph to be embedded, in this embodiment, the decoder may be designed based on the scoring function of the existing super-relationship knowledge graph (for example, the super-relationship knowledge graph to be embedded), so as to measure the probability that the entity and the relationship form the super-relationship fact.

In one example, for a super-relationship fact and an encoder composed of an L-layer graph convolution network, the entity and relation characterization vectors produced by the encoder are input to the decoder. The decoder scoring function based on the existing scoring function n-DistMult can be expressed as equation (4):

Wherein ⟨·⟩ represents a multilinear product operation satisfying ⟨h₁, h₂, …, h_n⟩ = ∑_i h₁[i]·h₂[i]·…·h_n[i], and the pooling function performs a pooling operation on the relation and attribute vectors, such as element-wise summation or averaging.

It should be noted that, the super-relationship knowledge graph embedding scheme based on attribute expansion provided in this embodiment is not limited to the above-described decoder structure of the scoring function, and any scoring function of the super-relationship knowledge graph may be used as the decoder portion in this embodiment.
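The multilinear product and the pooling operation mentioned above can be written directly in Python. The helper names `multilinear_product` and `pool` are hypothetical, and the sketch does not reproduce equation (4) itself: which characterization vectors enter the product is left to the caller.

```python
from functools import reduce
from operator import mul


def multilinear_product(*vectors):
    """Return sum_i h_1[i] * h_2[i] * ... * h_n[i] for equal-length vectors."""
    return sum(reduce(mul, column) for column in zip(*vectors))


def pool(vectors, mode="sum"):
    """Element-wise sum (default) or mean over equal-length vectors."""
    total = [sum(column) for column in zip(*vectors)]
    if mode == "sum":
        return total
    return [x / len(vectors) for x in total]
```

As a usage illustration with made-up vectors, a score could be formed as `multilinear_product(h_s, pool([h_r, h_a1, h_a2]), h_o)`, where the pooled vector stands in for the relation and attribute vectors of one fact.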

In step 730, a loss function is derived based on the fact score.

In one embodiment, the training fact tuples may include positive training fact tuples corresponding to the positive training sample set and negative training fact tuples corresponding to the negative training sample set, and the fact scores may include positive training fact scores corresponding to the positive training fact tuples and negative training fact scores corresponding to the negative training fact tuples. The loss function may be determined from the fact scores using equation (5):

Wherein the loss function is expressed in terms of φ(x), the positive training fact score, and φ(x′), the negative training fact score; the negative training sample set is the one constructed for the ith position of the fact tuple x; and x′ represents a negative training fact tuple.
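For illustration only, the sketch below shows one common way to turn positive and negative fact scores into a loss. It is an assumed logistic (softplus) loss, not a reconstruction of equation (5), whose exact form is not reproduced in the text; the names `softplus` and `logistic_loss` are hypothetical.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def logistic_loss(pos_score, neg_scores):
    """Assumed loss for one positive tuple x and its negative tuples x'.

    pos_score: phi(x); neg_scores: phi(x') for every negative built from x,
    pooled over all replaced positions.
    """
    # Push phi(x) up and every phi(x') down.
    return softplus(-pos_score) + sum(softplus(s) for s in neg_scores)
```

The loss is small when the positive score is high and the negative scores are low, which matches the stated convention that a higher score means a more plausible fact.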

In step 740, the parameters of the characterization model are updated based on the loss function, resulting in the trained characterization model.

In one embodiment, the parameter updating of the characterization model is complete when a convergence condition is reached. The trained characterization model is obtained from the parameter updating, and the embedded super-relationship knowledge graph corresponding to the super-relationship knowledge graph to be embedded is obtained from the trained characterization model, so that the super-relationship knowledge graph is completed. The convergence condition may be that the performance of the model on the validation set has not improved for several consecutive iterations, or that the maximum number of iterations has been reached. In an example, a threshold may be set according to the trained model: the super-relationship knowledge graph embedding method of the present invention calculates a score for a given super-relationship fact, the score being positively correlated with the likelihood that the fact is true, and the fact is considered true if its score is higher than the threshold.
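The convergence condition can be sketched as a small stopping rule. The function name `should_stop` and the parameter names `patience` and `max_iters` are hypothetical; the text only states that training stops when validation performance has not risen for several consecutive iterations or the maximum number of iterations is reached.

```python
def should_stop(val_scores, patience, max_iters):
    """Decide whether to stop training.

    val_scores: validation metric recorded after each completed iteration
    (larger is better). patience: number of consecutive iterations allowed
    without improvement. max_iters: upper limit on iterations.
    """
    if len(val_scores) >= max_iters:
        return True
    if len(val_scores) <= patience:
        return False
    best_before = max(val_scores[:-patience])
    # Stop if none of the last `patience` iterations beat the earlier best.
    return max(val_scores[-patience:]) <= best_before
```

The rule would be evaluated once after each pass over the training set, with the validation metric appended to `val_scores` beforehand.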

It should be noted that the problem of completing the super-relationship knowledge graph can be reduced to a link prediction problem, stated as follows. A super-relationship knowledge graph to be embedded, with some facts missing, is given; the entity set involved is denoted E and the relation set is denoted R, and the known super-relationship facts are denoted as a set in which i indexes the ith fact, with s_i ∈ E, o_i ∈ E, v_ij ∈ E, r_i ∈ R, and a_ij ∈ R. The link prediction task is to infer, from the observed facts, the facts missing from the super-relationship knowledge graph to be embedded, such as predicting the target (tail) entity missing from the core triplet in ((s, r,.

In order to further describe the super-relationship knowledge graph embedding method provided by the invention, the following description will be made with reference to fig. 8.

FIG. 8 is a second flowchart of the method for embedding a super-relational knowledge graph provided by the present invention.

In an exemplary embodiment of the present invention, as can be seen in fig. 8, the super-relationship knowledge graph embedding method may include steps 810 to 860, and each step will be described below.

In step 810, the knowledge-graph of the super-relationship to be embedded and the initialization parameters are input.

In one embodiment, the WikiPeople dataset may be taken as an example of the super-relationship knowledge graph to be embedded. The WikiPeople dataset includes entities (47765), relations (707), attribute-attribute value description set elements (2-9), training set entries (305725), validation set entries (38223), and test set entries (38281). The process of learning the super-relationship knowledge graph embedding and completing the link prediction is described below, taking WikiPeople as an example.

In an example, the input super-relationship knowledge graph to be embedded may be initialized, that is, a low-dimensional vector of length d is randomly generated for each of the 47765 entities in the dataset, and a low-dimensional vector of length d is randomly generated for each of the 707 relations. Further, model parameters of the encoder and decoder may be initialized based on the initially entered parameters.
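The random initialization can be sketched as follows. The entity and relation counts come from the WikiPeople figures above; the function name `init_embeddings`, the vector length d = 8, the uniform range, and the seeds are assumptions chosen only to keep the example small.

```python
import random


def init_embeddings(num_items, dim, seed=0, scale=0.1):
    """Return `num_items` random vectors of length `dim`.

    Values are drawn uniformly from [-scale, scale] (an assumed scheme).
    """
    rng = random.Random(seed)
    return [
        [rng.uniform(-scale, scale) for _ in range(dim)]
        for _ in range(num_items)
    ]


d = 8  # hypothetical embedding length
entity_emb = init_embeddings(47765, d, seed=0)
relation_emb = init_embeddings(707, d, seed=1)
```

After attribute expansion the converted graph may contain additional entities or relations, which would be initialized in the same way.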

In step 820, the knowledge-graph of the super-relationship to be embedded is converted into a knowledge-graph corresponding to the knowledge-graph of the super-relationship to be embedded based on the attribute expansion operation.

In one embodiment, the super-relationship knowledge graph to be embedded may be converted into a general knowledge graph by the attribute expansion operation. The star attribute expansion operation is described below as an example.

(1) Initialize the converted general knowledge graph (E, R, F);

Define the intermediate entity b_k and add it to the entity set: E ← E ∪ {b_k};

F ← F ∪ {(s, r, o), (b_k, r_s, s), (b_k, r_o, o)};

(5) Output the converted knowledge graph (E, R, F).

In step 830, message propagation and characterization updates are performed on the knowledge-graph based on the encoder.

It should be noted that the encoder may be an encoder based on a multi-relation graph convolutional network.

In one embodiment, the 305725 records in the training set may be divided equally into a plurality of batches, and one batch of data is taken in sequence at a time for training. For each fact tuple in the batch (each piece of super-relationship fact data in the converted knowledge graph may be mapped to a fact tuple), negative samples may be constructed: replacing the entity at position 1 yields the corresponding negative sample set (which may correspond to the negative training sample set). Likewise, replacing the entities at other positions (head and/or tail entities) constructs the corresponding negative sample sets.

In yet another example, the obtained entity and relation characterization matrices may be input into an encoder structure based on a graph convolution network to obtain updated entity and relation characterization vectors, where Enc denotes the encoder function, for which any graph convolution network structure may be employed.

In step 840, the updated entity and relation characterization vectors are input to a decoder based on a scoring function of the super-relationship knowledge graph to be embedded, and the score of each super-relationship fact is calculated.

In step 850, a loss function is derived based on the scores of the superrelationship facts, and a characterization model is trained based on the loss function.

In one embodiment, for each sample of the batch, the corresponding entity and relation characterization vectors may be selected from the updated characterization vectors and input into the decoder structure to calculate the score of the corresponding super-relationship fact. The loss function shown in formula (6) is then calculated, and the characterization vectors and the model parameters are updated by a gradient descent algorithm:

During application, steps 820 through 850 may be repeated until one iteration over the entire training set is completed. The link prediction task is then performed on the validation set and the mean reciprocal rank (MRR) is calculated; the higher the correct missing entity is ranked, the larger the MRR, indicating higher link prediction accuracy. In particular, when calculating the MRR, positive samples already observed in the data, other than the test fact itself, may be filtered out of the ranking, which gives the filtered MRR. If the MRR on the validation set does not rise for a given number of consecutive iterations, or the total number of iterations reaches a given limit, training is stopped. Otherwise, the above steps are repeated.
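The filtered MRR can be sketched as below. The function name `filtered_mrr`, the flat-tuple fact representation, and the `score_fn` callback are hypothetical; ties are resolved in favor of the correct entity, which is one of several possible conventions.

```python
def filtered_mrr(queries, score_fn, entities, known_facts):
    """Compute the filtered mean reciprocal rank.

    queries: list of (fact, position) pairs; fact[position] is the entity
    to be predicted. score_fn: maps a fact tuple to a score (higher means
    more plausible). known_facts: all observed positive tuples.
    """
    total = 0.0
    for fact, pos in queries:
        true_score = score_fn(fact)
        rank = 1
        for e in entities:
            if e == fact[pos]:
                continue
            candidate = fact[:pos] + (e,) + fact[pos + 1:]
            if candidate in known_facts:
                continue  # filtered setting: ignore other observed positives
            if score_fn(candidate) > true_score:
                rank += 1
        total += 1.0 / rank
    return total / len(queries)
```

In the filtered setting a candidate that is itself an observed fact does not push the correct entity down the ranking, so the metric is not penalized for ranking another true answer first.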

In step 860, the knowledge graph of the super relationship to be embedded is input to the trained characterization model to complement the knowledge graph of the super relationship to be embedded.

In one embodiment, the super-relationship knowledge graph to be embedded may be completed based on the trained characterization model. In the application process, the link prediction task can be performed on the test set. The comparison of the link prediction results on the super-relationship knowledge graph to be embedded is shown in table 1.

Table 1. Comparison of link prediction results on the super-relationship knowledge graph to be embedded

The data in table 1 are predictive performance indices, where a larger value indicates better predictive performance. As can be seen from table 1, the attribute-expansion-based super-relationship knowledge graph embedding method provided by the invention achieves a clear improvement over the baseline methods.

According to the description, the super-relationship knowledge graph embedding method provided by the invention can convert the super-relationship knowledge graph to be embedded into the conventional knowledge graph corresponding to the super-relationship knowledge graph to be embedded through attribute unfolding operation. Because the embedding of the super-relationship knowledge graph based on the conventional knowledge graph can improve the quality of the embedded super-relationship knowledge graph, a training sample set is obtained from a fact multi-group corresponding to the facts in the conventional knowledge graph, a trained characterization model is obtained based on the training sample set, and the super-relationship knowledge graph to be embedded is input into the trained characterization model to obtain the embedded super-relationship knowledge graph output by the trained characterization model. The invention improves the expression capability and the prediction effect of the embedded super-relationship knowledge graph.

Based on the same conception, the invention also provides a super-relationship knowledge graph embedding device.

The super-relationship knowledge graph embedding device provided by the invention is described below, and the super-relationship knowledge graph embedding device described below and the super-relationship knowledge graph embedding method described above can be correspondingly referred to each other.

Fig. 9 is a schematic structural diagram of the super-relationship knowledge graph embedding device provided by the invention.

In an exemplary embodiment of the present invention, as can be seen in fig. 9, the super-relationship knowledge graph embedding apparatus may include an acquisition module 910, a conversion module 920, a processing module 930, a training module 940, and a generating module 950, where each module will be described below.

The acquisition module 910 may be configured to acquire a knowledge-graph of the super-relationship to be embedded.

The conversion module 920 may be configured to convert the super-relationship knowledge graph to be embedded into a knowledge graph corresponding to the super-relationship knowledge graph to be embedded based on the attribute expansion operation, where the knowledge graph may include a plurality of facts.

The processing module 930 may be configured to derive a training sample set corresponding to the fact tuple based on the fact tuple corresponding to the fact.

The training module 940 may be configured to train the characterization model based on the training sample set, resulting in a trained characterization model.

The generating module 950 may be configured to input the to-be-embedded super-relationship knowledge-graph to the post-training characterization model, resulting in an embedded super-relationship knowledge-graph output by the post-training characterization model.

In an exemplary embodiment of the present invention, the attribute expansion operation may include a star-like attribute expansion operation, and the super relationship knowledge graph may include a plurality of super relationship facts, where the super relationship facts may include a core triplet and attribute-attribute value pairs, and the core triplet may include a head entity, a tail entity, and a relationship. The conversion module 920 may convert the knowledge-graph of the super-relationship to be embedded into a knowledge-graph corresponding to the knowledge-graph of the super-relationship to be embedded based on the attribute expansion operation in the following manner: introducing an intermediate entity and two intermediate relations; connecting the intermediate entity with the head entity and the tail entity respectively based on the intermediate relation through star attribute unfolding operation; based on the attribute relation in the attribute-attribute value pair, connecting the intermediate entity with the attribute value entity in the attribute-attribute value pair, and connecting the head entity with the tail entity based on the relation in the core triplet, so as to jointly obtain a knowledge graph corresponding to the knowledge graph to be embedded with the super-relation.

In an exemplary embodiment of the present invention, the attribute expansion operation may include a blob attribute expansion operation, and the super-relationship knowledge graph may include a plurality of super-relationship facts, where the super-relationship facts may include a core triplet and attribute-attribute value pairs, and the core triplet may include a head entity, a tail entity, and a relation. The conversion module 920 may convert the super-relationship knowledge graph to be embedded into a knowledge graph corresponding to the super-relationship knowledge graph to be embedded based on the attribute expansion operation in the following manner: introducing four intermediate relations, wherein the intermediate relations are determined according to the relations in the core triplets, and the intermediate relations comprise a first intermediate relation, a second intermediate relation, a third intermediate relation and a fourth intermediate relation; connecting the head entity with the attribute value entities in the attribute-attribute value pairs based on the first intermediate relation and the second intermediate relation through the blob attribute expansion operation; connecting the tail entity with the attribute value entities in the attribute-attribute value pairs based on the third intermediate relation and the fourth intermediate relation; and connecting the head entity with the tail entity based on the relation in the core triplet, so as to jointly obtain the knowledge graph corresponding to the super-relationship knowledge graph to be embedded.

In an exemplary embodiment of the present invention, the training sample set may include a positive training sample set and a negative training sample set; the processing module 930 may derive the training sample set corresponding to the fact tuples from the fact tuples corresponding to the facts in the following manner: taking the fact tuples corresponding to the facts as the positive training sample set; and replacing the head entities and/or the tail entities in the fact tuples to obtain the negative training sample set corresponding to the fact tuples.

In an exemplary embodiment of the present invention, the training sample set may include a training fact tuple corresponding to the fact tuple, and the training fact tuple may include a training entity and a training relationship; the training module 940 may train the characterization model based on the training sample set in the following manner to obtain a trained characterization model: inputting the training sample set into the characterization model, and updating the characterizations of the training entities and training relationships in the training sample set based on an encoder in the characterization model to obtain updated training entity characterization vectors and updated training relationship characterization vectors; inputting the updated training entity characterization vectors and the updated training relationship characterization vectors to a decoder to obtain a fact score corresponding to the training fact tuple, wherein the decoder is based on a scoring function of the super-relationship knowledge graph to be embedded; obtaining a loss function based on the fact score; and updating the parameters of the characterization model based on the loss function to obtain the trained characterization model.

In an exemplary embodiment of the present invention, the training fact tuples may include positive training fact tuples corresponding to positive training sample sets and negative training fact tuples corresponding to negative training sample sets, and the fact scores may include positive training fact scores corresponding to the positive training fact tuples and negative training fact scores corresponding to the negative training fact tuples; the training module 940 may derive a loss function based on the fact score using equation (7):

Fig. 10 illustrates a physical structure diagram of an electronic device, as shown in fig. 10, which may include: processor 1010, communication interface (Communications Interface) 1020, memory 1030, and communication bus 1040, wherein processor 1010, communication interface 1020, and memory 1030 communicate with each other via communication bus 1040. Processor 1010 may invoke logic instructions in memory 1030 to perform a super-relationship knowledge-graph embedding method comprising: acquiring a knowledge graph of the super relation to be embedded; converting the knowledge graph to be embedded into a knowledge graph corresponding to the knowledge graph to be embedded based on attribute unfolding operation, wherein the knowledge graph comprises a plurality of facts; obtaining a training sample set corresponding to the fact multi-group based on the fact multi-group corresponding to the fact; training the characterization model based on the training sample set to obtain a trained characterization model; and inputting the to-be-embedded super-relationship knowledge graph into the trained characterization model to obtain the embedded super-relationship knowledge graph output by the trained characterization model.

Further, the logic instructions in the memory 1030 described above may be implemented in the form of software functional units and stored in a computer readable storage medium when sold or used as a stand alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.

In another aspect, the present invention further provides a computer program product comprising a computer program that can be stored on a non-transitory computer-readable storage medium. When the computer program is executed by a processor, the computer can perform the super-relationship knowledge graph embedding method provided above, the method comprising: acquiring a to-be-embedded super-relationship knowledge graph; converting the to-be-embedded super-relationship knowledge graph into a corresponding knowledge graph based on an attribute unfolding operation, wherein the knowledge graph comprises a plurality of facts; obtaining, based on the fact multi-group corresponding to each fact, a training sample set corresponding to the fact multi-group; training a characterization model based on the training sample set to obtain a trained characterization model; and inputting the to-be-embedded super-relationship knowledge graph into the trained characterization model to obtain the embedded super-relationship knowledge graph output by the trained characterization model.

In yet another aspect, the present invention further provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the super-relationship knowledge graph embedding method provided above, the method comprising: acquiring a to-be-embedded super-relationship knowledge graph; converting the to-be-embedded super-relationship knowledge graph into a corresponding knowledge graph based on an attribute unfolding operation, wherein the knowledge graph comprises a plurality of facts; obtaining, based on the fact multi-group corresponding to each fact, a training sample set corresponding to the fact multi-group; training a characterization model based on the training sample set to obtain a trained characterization model; and inputting the to-be-embedded super-relationship knowledge graph into the trained characterization model to obtain the embedded super-relationship knowledge graph output by the trained characterization model.

The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.

From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.

It will further be appreciated that although operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A super-relationship knowledge graph embedding method, characterized in that the method is applied to completing missing relations in questions to be answered in a search question-answering scenario, and comprises the following steps:

acquiring a knowledge graph of the super relation to be embedded;

converting the to-be-embedded super-relationship knowledge graph into a knowledge graph corresponding to the to-be-embedded super-relationship knowledge graph based on an attribute unfolding operation, wherein the knowledge graph comprises a plurality of facts corresponding to keywords in the questions to be answered;

obtaining a training sample set corresponding to a fact multi-group based on the fact multi-group corresponding to the fact;

training a characterization model based on the training sample set to obtain a trained characterization model;

inputting the to-be-embedded super-relationship knowledge graph into the trained characterization model to obtain an embedded super-relationship knowledge graph output by the trained characterization model; wherein the attribute unfolding operation comprises a star attribute unfolding operation, the super-relationship knowledge graph comprises a plurality of super-relationship facts, each super-relationship fact comprises a core triplet and attribute-attribute value pairs, and the core triplet comprises a head entity, a tail entity and a relationship;

the converting of the to-be-embedded super-relationship knowledge graph into the corresponding knowledge graph based on the attribute unfolding operation specifically comprises the following steps:

introducing an intermediate entity and two intermediate relations;

connecting the intermediate entity with the head entity and the tail entity based on the intermediate relations through the star attribute unfolding operation;

connecting the intermediate entity with an attribute value entity in the attribute-attribute value pair based on the attribute relationship in the attribute-attribute value pair, and

connecting the head entity and the tail entity based on the relationship in the core triplet, so as to jointly obtain the knowledge graph corresponding to the to-be-embedded super-relationship knowledge graph; or

the attribute unfolding operation comprises a bulk attribute unfolding operation, the super-relationship knowledge graph comprises a plurality of super-relationship facts, each super-relationship fact comprises a core triplet and attribute-attribute value pairs, and the core triplet comprises a head entity, a tail entity and a relationship;

introducing four intermediate relations, wherein the intermediate relations are determined according to the relationship in the core triplet and comprise a first intermediate relation, a second intermediate relation, a third intermediate relation and a fourth intermediate relation;

connecting the head entity and the attribute value entity in the attribute-attribute value pair based on the first intermediate relation and the second intermediate relation through the bulk attribute unfolding operation;

connecting the tail entity and the attribute value entity in the attribute-attribute value pair based on the third intermediate relation and the fourth intermediate relation, and

connecting the head entity and the tail entity based on the relationship in the core triplet, so as to jointly obtain the knowledge graph corresponding to the to-be-embedded super-relationship knowledge graph.
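By way of non-limiting illustration, the two unfolding operations recited in claim 1 can be sketched as follows. The sketch rests on several assumptions that the claim does not fix: the edge directions, the names `mid_to_head`, `mid_to_tail` and `_fact_<id>`, the `relation#k` naming of the four intermediate relations, and the placeholder fact are all hypothetical. In the bulk variant the attribute relation itself is not used, because the claim names only the four intermediate relations.

```python
from typing import List, Sequence, Tuple

Triple = Tuple[str, str, str]
Pair = Tuple[str, str]  # (attribute relation, attribute value entity)


def star_unfold(head: str, relation: str, tail: str,
                pairs: Sequence[Pair], fact_id: int) -> List[Triple]:
    """Star unfolding: one intermediate entity, two intermediate relations."""
    mid = f"_fact_{fact_id}"  # hypothetical name for the intermediate entity
    triples: List[Triple] = [
        (head, relation, tail),      # the core triplet is kept
        (mid, "mid_to_head", head),  # assumed intermediate relation 1
        (mid, "mid_to_tail", tail),  # assumed intermediate relation 2
    ]
    # Each attribute relation links the intermediate entity to its value entity.
    triples += [(mid, attribute, value) for attribute, value in pairs]
    return triples


def bulk_unfold(head: str, relation: str, tail: str,
                pairs: Sequence[Pair]) -> List[Triple]:
    """Bulk unfolding: four intermediate relations derived from the core relation."""
    # Assumed naming scheme; the claim only says they depend on the core relation.
    r1, r2, r3, r4 = (f"{relation}#{k}" for k in range(1, 5))
    triples: List[Triple] = [(head, relation, tail)]  # the core triplet is kept
    for _attribute, value in pairs:
        triples += [
            (head, r1, value), (value, r2, head),  # head <-> attribute value
            (tail, r3, value), (value, r4, tail),  # tail <-> attribute value
        ]
    return triples


# Placeholder super-relationship fact: core triplet plus one attribute-value pair.
fact = ("alice", "works_at", "acme", [("role", "engineer")])
star = star_unfold(*fact, fact_id=0)
bulk = bulk_unfold(*fact)
```

Under these assumptions, a fact with one attribute-attribute value pair yields four triples in the star form and five in the bulk form, and the core triplet appears in both.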

2. The method for embedding a super-relationship knowledge-graph according to claim 1, wherein the training sample set comprises a positive training sample set and a negative training sample set;

the obtaining of the training sample set corresponding to the fact multi-group based on the fact multi-group corresponding to the fact specifically comprises the following steps:

taking the fact multi-group corresponding to the fact as the positive training sample set corresponding to the fact multi-group;

replacing a head entity and/or a tail entity in the fact multi-group corresponding to the fact, so as to obtain the negative training sample set corresponding to the fact multi-group.
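As a non-limiting sketch of the sample construction in claim 2, the following enumerates negative samples by replacing the head entity or the tail entity of a fact multi-group. The tuple layout `(head, relation, tail, pairs)`, the function name `build_negatives`, and the filtering of candidates that are already known positives are assumptions made for illustration; the claim itself only requires the replacement.

```python
from typing import List, Sequence, Set, Tuple

# Assumed layout of a fact multi-group: core triplet plus attribute-value pairs.
Fact = Tuple[str, str, str, Tuple[Tuple[str, str], ...]]


def build_negatives(fact: Fact, entities: Sequence[str],
                    positives: Set[Fact]) -> List[Fact]:
    """Corrupt the head or the tail of `fact` with every candidate entity."""
    head, relation, tail, pairs = fact
    negatives: List[Fact] = []
    for entity in entities:
        candidates = (
            (entity, relation, tail, pairs),  # head replaced
            (head, relation, entity, pairs),  # tail replaced
        )
        for candidate in candidates:
            # Skip the fact itself, known positives (assumed filter), duplicates.
            if candidate == fact or candidate in positives:
                continue
            if candidate not in negatives:
                negatives.append(candidate)
    return negatives


# Placeholder data: the positive set is the set of observed fact multi-groups.
entities = ["A", "B", "C"]
positive_set: Set[Fact] = {("A", "r", "B", ()), ("A", "r", "C", ())}
negative_set = build_negatives(("A", "r", "B", ()), entities, positive_set)
```

Enumerating every candidate keeps the sketch deterministic; sampling a fixed number of replacements per fact is an equally plausible reading of the claim.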

3. The method of claim 1, wherein the training sample set includes a training fact tuple corresponding to the fact tuple, the training fact tuple including a training entity and a training relationship;

the training of the characterization model based on the training sample set to obtain the trained characterization model specifically comprises the following steps:

inputting the training sample set into the characterization model, and performing characterization update on the training entities and the training relationships in the training sample set based on an encoder in the characterization model to obtain updated training entity characterization vectors and updated training relationship characterization vectors;

inputting the updated training entity characterization vectors and the updated training relationship characterization vectors into a decoder to obtain a fact score corresponding to the training fact tuple, wherein the decoder is a decoder based on a scoring function of the to-be-embedded super-relationship knowledge graph;

obtaining a loss function based on the fact score;

and updating parameters of the characterization model based on the loss function to obtain the trained characterization model.
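The encode, score, loss and update sequence of claim 3 can be illustrated by the following minimal sketch, which is not the claimed model. It assumes the facts have already been unfolded into triples, stands in for the encoder with a plain embedding lookup, uses a TransE-style squared-distance score as the decoder, and uses a margin ranking loss; all three choices, and every name in the code, are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Table = Dict[str, List[float]]


def distance(h: List[float], r: List[float], t: List[float]) -> float:
    """Squared Euclidean distance between h + r and t."""
    return sum((a + b - c) ** 2 for a, b, c in zip(h, r, t))


def score(triple: Triple, ent: Table, rel: Table) -> float:
    """Decoder stand-in (assumed TransE-style): higher means more plausible."""
    h, r, t = triple
    return -distance(ent[h], rel[r], ent[t])  # lookup plays the encoder role


def margin_loss(pos: Triple, neg: Triple, ent: Table, rel: Table,
                margin: float = 1.0) -> float:
    """Assumed margin ranking loss over one positive and one negative fact."""
    return max(0.0, margin - score(pos, ent, rel) + score(neg, ent, rel))


def train_step(pos: Triple, neg: Triple, ent: Table, rel: Table,
               lr: float = 0.1, margin: float = 1.0) -> float:
    """One in-place gradient step; returns the loss before the update."""
    loss = margin_loss(pos, neg, ent, rel, margin)
    if loss == 0.0:
        return loss  # margin already satisfied, gradient is zero
    # Compute both residuals first, since pos and neg may share embeddings.
    residuals = []
    for (h, r, t), sign in ((pos, 1.0), (neg, -1.0)):
        err = [a + b - c for a, b, c in zip(ent[h], rel[r], ent[t])]
        residuals.append((h, r, t, sign, err))
    for h, r, t, sign, err in residuals:
        for k, e in enumerate(err):
            g = 2.0 * sign * e  # d(loss)/d(h_k) and d(loss)/d(r_k)
            ent[h][k] -= lr * g
            rel[r][k] -= lr * g
            ent[t][k] += lr * g  # d(loss)/d(t_k) is -g
    return loss


# Placeholder two-dimensional embeddings for one positive/negative pair.
ent: Table = {"h": [0.0, 0.0], "t": [0.0, 1.0], "t_neg": [1.0, 0.0]}
rel: Table = {"r": [1.0, 0.0]}
positive, negative = ("h", "r", "t"), ("h", "r", "t_neg")
loss_before = train_step(positive, negative, ent, rel)
loss_after = margin_loss(positive, negative, ent, rel)
```

In this sketch a single step lowers the loss on the example pair; a practical encoder would additionally propagate information between neighbouring entities and relationships before scoring.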

4. The method of claim 3, wherein the training fact tuples comprise positive training fact tuples corresponding to a positive training sample set and negative training fact tuples corresponding to a negative training sample set, the fact score comprising a positive training fact score corresponding to the positive training fact tuple and a negative training fact score corresponding to the negative training fact tuple;

based on the fact score, the loss function is determined using the following formula:

[Formula not reproduced in the source text.]

wherein the symbols of the formula respectively denote: the loss function; the positive training fact score; the negative training fact score; the negative training sample set constructed for the i-th position of the fact tuple; and the negative training fact tuple.

5. A super-relationship knowledge graph embedding device, characterized in that the device is applied to completing missing relations in questions to be answered in a search question-answering scenario, and the device comprises:

the acquisition module is used for acquiring a to-be-embedded super-relationship knowledge graph;

the conversion module is used for converting the to-be-embedded super-relationship knowledge graph into a knowledge graph corresponding to the to-be-embedded super-relationship knowledge graph based on an attribute unfolding operation, wherein the knowledge graph comprises a plurality of facts corresponding to keywords in the questions to be answered;

the processing module is used for obtaining a training sample set corresponding to the fact multi-group based on the fact multi-group corresponding to the fact;

the training module is used for training the characterization model based on the training sample set to obtain a trained characterization model;

the generation module is used for inputting the to-be-embedded super-relationship knowledge graph into the trained characterization model to obtain an embedded super-relationship knowledge graph output by the trained characterization model; wherein the attribute unfolding operation comprises a star attribute unfolding operation, the super-relationship knowledge graph comprises a plurality of super-relationship facts, each super-relationship fact comprises a core triplet and attribute-attribute value pairs, and the core triplet comprises a head entity, a tail entity and a relationship;

the conversion module converts the to-be-embedded super-relationship knowledge graph into the corresponding knowledge graph based on the attribute unfolding operation in the following manner:

introducing an intermediate entity and two intermediate relations;

6. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the super-relationship knowledge-graph embedding method of any one of claims 1 to 4.

7. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the super-relationship knowledge-graph embedding method of any of claims 1 to 4.

8. A computer program product comprising a computer program which, when executed by a processor, implements the super-relationship knowledge-graph embedding method of any one of claims 1 to 4.