CN117951308A - Zero sample knowledge graph completion method and device


Info

Publication number
CN117951308A
Authority
CN
China
Legal status
Pending
Application number
CN202410134194.8A
Other languages
Chinese (zh)
Inventor
马金鸣
孙博
刘德平
董思聪
Current Assignee
Agricultural Bank of China
Original Assignee
Agricultural Bank of China
Application filed by Agricultural Bank of China
Priority to CN202410134194.8A
Publication of CN117951308A

Abstract

The application discloses a zero-sample knowledge graph completion method and device, wherein the method comprises the following steps: determining, based on the text description of a relationship to be predicted, the knowledge graph structural features corresponding to the text description; determining an embedded representation of the relationship to be predicted based on the text description and the knowledge graph structural features; obtaining a first feature vector of a candidate entity pair, wherein the candidate entity pair and the relationship to be predicted are the elements that form a completion triplet; and determining the completion triplet based on the similarity between the first feature vector and the embedded representation. That is, a candidate entity pair with higher similarity and the relationship to be predicted form a triplet, and the knowledge graph is completed. With this zero-sample knowledge graph completion method, triplet completion can be performed based on the features of the relationship to be predicted and the entity pair without introducing ontology information, so the accuracy of knowledge graph completion can be improved while protecting user privacy.

Description

Zero sample knowledge graph completion method and device
Technical Field
The application relates to the technical field of knowledge graphs, and in particular to a zero-sample knowledge graph completion method and device.
Background
The knowledge graph is a structured semantic knowledge base whose basic constituent unit is the entity-relation-entity triplet, where entities are objects or individuals in the real world and relations are the relationships that exist between entities.
As the number of entities in the knowledge graph increases, the completeness of the knowledge graph decreases. To complete the missing relationships in the knowledge graph, researchers proposed the concept of knowledge graph completion. Current zero-sample knowledge graph completion methods need to enrich the semantic expression of the relationship to be predicted by introducing additional ontology information; the ontology information may represent, for example, category description information of entities and relations. A generative model is trained based on features of the ontology information and triplet features, and the generative model is then used to generate relation embeddings of the knowledge graph according to the text description of the relationship to be predicted and the ontology information.
However, in practical applications the ontology information in many fields, such as human resources, usually involves personal privacy information, and collecting this ontology information is difficult; that is, the number of training samples is insufficient, so the accuracy of current knowledge graph completion methods in these fields is low.
Disclosure of Invention
In view of this, the application provides a zero-sample knowledge graph completion method and device, so as to improve the accuracy of zero-sample knowledge graph completion while protecting user privacy, without introducing additional ontology information.
In a first aspect, the present application provides a zero-sample knowledge graph completion method, the method comprising:
Determining structural features of the knowledge graph based on the text description of the relation to be predicted;
Determining an embedded representation of the relationship to be predicted based on the textual description and the knowledge-graph structural features;
Acquiring a first feature vector of a candidate entity pair;
determining a completion triplet based on the similarity between the first feature vector and the embedded representation, the triplet being composed of the relationship to be predicted and the candidate entity pair.
In one possible implementation manner, the determining the knowledge-graph structural feature based on the text description of the relationship to be predicted includes:
Acquiring the knowledge graph structural features based on a structure coding unit; the structure coding unit is trained based on a label triplet vector, a positive sample triplet vector and a negative sample triplet vector.
In one possible implementation, the structure coding unit includes a linear layer and an activation function layer, and the training process of the structure coding unit includes:
Acquiring the label triplet vector, the positive sample triplet vector and the negative sample triplet vector;
inputting the label triplet vector, the positive sample triplet vector and the negative sample triplet vector into the linear layer and the activation function layer to obtain a label triplet embedded representation, a positive sample embedded representation and a negative sample embedded representation;
Determining a loss function based on the Euclidean distance between the label triplet embedded representation and the positive sample embedded representation and the Euclidean distance between the label triplet embedded representation and the negative sample embedded representation;
and adjusting the parameters of the linear layer for training, with reduction of the loss function value as the objective, until a preset number of training iterations is reached, to obtain the trained linear layer.
In one possible implementation manner, the candidate entity pair includes a head entity and a tail entity, and the obtaining a first feature vector of the candidate entity pair includes:
acquiring a head entity vector of the head entity and a tail entity vector of the tail entity;
Performing convolution processing on the head entity vector and the tail entity vector to determine an entity pair feature vector;
and processing the entity pair feature vector based on a linear mapping relation and an activation function to determine the first feature vector.
In one possible implementation, the determining the completion triplet based on the similarity between the first feature vector and the embedded representation includes:
acquiring a second feature vector of a neighbor node of the candidate entity pair;
Determining an enhanced feature vector based on the first feature vector and the second feature vector;
determining the triplet based on the similarity between the enhanced feature vector and the embedded representation.
In a possible implementation manner, the obtaining the second feature vector of the neighboring node of the candidate entity pair includes:
acquiring neighbor node vectors of the neighbor nodes;
performing dot product attention calculation on the neighbor node vectors, and determining an attention coefficient;
determining the second feature vector based on the attention coefficient.
In one possible implementation, the determining the second feature vector based on the attention coefficient includes:
determining a third feature vector based on the attention coefficient and the neighbor node vector;
And processing the third feature vector based on the linear mapping relation and the activation function to determine the second feature vector.
In a possible implementation manner, the determining the embedded representation of the relationship to be predicted based on the text description and the knowledge-graph structural feature includes:
acquiring text characteristics of the text description;
And carrying out aggregation processing on the text features and the knowledge graph structural features, and determining the embedded representation.
In one possible implementation manner, the obtaining the text feature of the text description includes:
Performing text embedding processing on the text description to obtain a text vector;
and performing feature extraction on the text vector based on a long short-term memory network (LSTM) to acquire the text features.
In a second aspect, the present application provides a zero-sample knowledge-graph completion device, the device comprising:
the first determining unit is used for determining the structural characteristics of the knowledge graph based on the text description of the relation to be predicted;
A second determining unit, configured to determine an embedded representation of the relationship to be predicted based on the text description and the knowledge-graph structural feature;
An obtaining unit, configured to obtain a first feature vector of a candidate entity pair;
And a third determining unit, configured to determine a completion triplet based on the similarity between the first feature vector and the embedded representation, where the triplet is composed of the relationship to be predicted and the candidate entity pair.
In a possible implementation manner, the first determining unit is specifically configured to obtain the knowledge graph structural features based on a structure coding unit; the structure coding unit is trained based on the label triplet vector, the positive sample triplet vector and the negative sample triplet vector.
In one possible implementation, the structure coding unit includes a linear layer and an activation function layer, and the training process of the structure coding unit includes: acquiring the label triplet vector, the positive sample triplet vector and the negative sample triplet vector; inputting the label triplet vector, the positive sample triplet vector and the negative sample triplet vector into the linear layer and the activation function layer to obtain a label triplet embedded representation, a positive sample embedded representation and a negative sample embedded representation; determining a loss function based on the Euclidean distance between the label triplet embedded representation and the positive sample embedded representation and the Euclidean distance between the label triplet embedded representation and the negative sample embedded representation; and adjusting the parameters of the linear layer for training, with reduction of the loss function value as the objective, until a preset number of training iterations is reached, to obtain the trained linear layer.
In a possible implementation manner, the candidate entity pair includes a head entity and a tail entity, and the acquiring unit is specifically configured to acquire a head entity vector of the head entity and a tail entity vector of the tail entity; perform convolution processing on the head entity vector and the tail entity vector to determine an entity pair feature vector; and process the entity pair feature vector based on a linear mapping relation and an activation function to determine the first feature vector.
In a possible implementation manner, the third determining unit is specifically configured to obtain a second feature vector of a neighbor node of the candidate entity pair; determine an enhanced feature vector based on the first feature vector and the second feature vector; and determine the triplet based on the similarity between the enhanced feature vector and the embedded representation.
In a possible implementation manner, the third determining unit is specifically configured to obtain a neighbor node vector of the neighbor node; perform dot product attention calculation on the neighbor node vector to determine an attention coefficient; and determine the second feature vector based on the attention coefficient.
In a possible implementation manner, the third determining unit is specifically configured to determine a third feature vector based on the attention coefficient and the neighbor node vector; and process the third feature vector based on the linear mapping relation and the activation function to determine the second feature vector.
In a possible implementation manner, the second determining unit is specifically configured to obtain the text features of the text description; and perform aggregation processing on the text features and the knowledge graph structural features to determine the embedded representation.
In a possible implementation manner, the second determining unit is specifically configured to perform text embedding processing on the text description to obtain a text vector; and perform feature extraction on the text vector based on a long short-term memory network (LSTM) to acquire the text features.
In a third aspect, the present application provides a zero-sample knowledge-graph completion apparatus, the apparatus comprising: a memory and a processor;
the memory is used for storing related program codes;
The processor is configured to invoke the program code to execute the zero sample knowledge graph completion method according to any implementation manner of the first aspect.
In a fourth aspect, the present application provides a computer readable storage medium, where the computer readable storage medium is configured to store a computer program, where the computer program is configured to perform the zero sample knowledge graph completion method according to any implementation manner of the first aspect.
From this, the application has the following beneficial effects:
In the above implementation manner of the present application, the knowledge graph structural features corresponding to the text description may be determined based on the text description of the relationship to be predicted. An embedded representation of the relationship to be predicted is then determined based on the text description and the knowledge graph structural features. A first feature vector of the candidate entity pair is obtained, where the candidate entity pair and the relationship to be predicted are the elements that form a completion triplet; there may be a plurality of candidate entity pairs and a plurality of relationships to be predicted. A completion triplet is determined based on the similarity between the first feature vector and the embedded representation. That is, a candidate entity pair with higher similarity and the relationship to be predicted form a triplet, and the knowledge graph is completed. With the zero-sample knowledge graph completion method provided by the application, triplet completion can be performed based on the features of the relationship to be predicted and the entity pair without introducing ontology information, so the accuracy of knowledge graph completion can be improved while protecting user privacy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments provided in the present application, and those of ordinary skill in the art may derive other drawings from these drawings.
FIG. 1 is a flowchart of a zero-sample knowledge graph completion method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a hybrid attention unit according to an embodiment of the present application;
FIG. 3 is a flowchart of another zero sample knowledge graph completion method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a zero-sample knowledge graph completion device according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a zero-sample knowledge graph completion apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present application. The described embodiments are merely some exemplary implementations, not all implementations of the application. Based on the embodiments of the application, those skilled in the art can obtain other embodiments without inventive effort, and such embodiments also fall within the scope of the application.
The knowledge graph is a graph structure model for representing knowledge and information, and organizes entities and relations together to form a semantically interrelated knowledge network, and can be used for knowledge management, knowledge discovery and intelligent application in various fields.
As the number of entities in the knowledge graph increases, the completeness of the knowledge graph decreases. To complete the missing relationships in the knowledge graph, researchers proposed the concept of knowledge graph completion. Traditional knowledge graph completion methods can only complete relations that already exist in the knowledge graph and cannot predict relations that do not exist in the original knowledge graph. Therefore, researchers proposed the concept of zero-sample knowledge graph completion, i.e., knowledge graph completion without using existing triplets as samples.
Existing zero-sample knowledge graph completion methods need to enrich the semantic expression of the relationship to be predicted by introducing additional ontology information, where the ontology information may represent category description information of entities and relations, and the like. A generative model is trained based on features of the ontology information and triplet features, and the generative model is then used to generate relation embeddings of the knowledge graph according to the text description of the relationship to be predicted and the ontology information.
In practical applications, however, the ontology information in many fields usually involves personal privacy information. For example, in the human resources field, ontology information is typically the user's occupation, position, income, family relationships, and so on. Collecting such ontology information is difficult; that is, the number of training samples is insufficient, so the accuracy of current knowledge graph completion methods in these fields is low.
Based on the above, the embodiment of the application provides a zero-sample knowledge graph completion method, so as to improve the accuracy of knowledge graph completion while protecting user privacy, without introducing additional ontology information. In specific implementation, the knowledge graph structural features corresponding to the text description may be determined based on the text description of the relationship to be predicted. An embedded representation of the relationship to be predicted is then determined based on the text description and the knowledge graph structural features. A first feature vector of the candidate entity pair is obtained, where the candidate entity pair and the relationship to be predicted are the elements that form a completion triplet; there may be a plurality of candidate entity pairs and a plurality of relationships to be predicted. A completion triplet is determined based on the similarity between the first feature vector and the embedded representation. That is, a candidate entity pair with higher similarity and the relationship to be predicted form a triplet, and the knowledge graph is completed. With the zero-sample knowledge graph completion method provided by the application, triplet completion can be performed based on the features of the relationship to be predicted and the entity pair without introducing ontology information, so the accuracy of knowledge graph completion can be improved while protecting user privacy.
In order to facilitate understanding of the technical solutions provided by the embodiments of the present application, the following description will specifically refer to the accompanying drawings in the specification.
Referring to FIG. 1, FIG. 1 is a flowchart of a zero-sample knowledge graph completion method according to an embodiment of the present application.
The method may be performed by a data processing device, where the data processing device may be an electronic device or another device, and the embodiment of the present application is not limited to this.
The method may comprise the steps of:
s101: and determining the structural characteristics of the knowledge graph based on the text description of the relation to be predicted.
The text description of the relationship to be predicted may be a description of the relationship to be predicted in natural language. For example, the text description of the relationship to be predicted may be "Xiaohong and Xiaoming got married last year". By processing the text description, the knowledge graph structural features corresponding to the relationship to be predicted are extracted.
In one possible implementation manner, the text description of the relationship to be predicted can be processed by a trained structure coding unit to obtain the knowledge graph structural features of the text description. The structure coding unit is trained based on the label triplet vector, the positive sample triplet vector and the negative sample triplet vector.
Optionally, after the text description of the relationship to be predicted is obtained, text embedding processing may be performed on the text description to obtain a text vector corresponding to the text description. The text vector is then input into the structure coding unit, which determines the knowledge graph structural features corresponding to the text vector. Text embedding is a common processing method in the field of natural language processing; it can map text data into a vector space of fixed length while retaining certain semantic information of the original text.
In one possible implementation, the structure coding unit may consist of a linear layer and an activation function layer. The linear layer is used to perform linear mapping on a vector. For example, when the original vector is X, the linear mapping of X may be expressed as X' = AX + b, where A may represent a weight matrix and b may represent a bias vector. The activation function layer comprises an activation function, i.e., the vector is processed by the activation function, whose main function is to provide non-linear processing capability.
For example, a rectified linear unit (ReLU) may be used as the activation function, where the ReLU function can be expressed as ReLU(x) = max(0, x).
Alternatively, in order to enable the structure coding unit to extract more feature information from the text description, feature extraction may be performed on the text vector using two linear layers and two activation functions to obtain the knowledge graph structural features.
In order to facilitate understanding of the working principle of the structure coding unit, its training procedure is described first.
When training the structure coding unit, training samples must first be obtained, namely the label triplet vector, the positive sample triplet vector and the negative sample triplet vector. The label triplet vector serves as a reference vector: it has high similarity to the positive sample triplet vector and differs greatly from the negative sample triplet vector. The positive sample triplet vector may include a head entity vector, a tail entity vector and a relation vector. A triplet used as a positive sample has a correct relationship between its head entity and tail entity, and a triplet used as a negative sample can be obtained by randomly replacing the relation in a positive sample triplet. The head entity vector may be obtained by performing text embedding processing on the head entity.
After the label triplet vector, the positive sample triplet vector and the negative sample triplet vector are input to the linear layer and the activation function layer, a label triplet embedded representation, a positive sample embedded representation and a negative sample embedded representation are obtained. The Euclidean distance between the label triplet embedded representation and the positive sample embedded representation and the Euclidean distance between the label triplet embedded representation and the negative sample embedded representation are then determined, and the loss function is determined from these two Euclidean distances. With reduction of the loss function value as the objective, the parameters of the linear layer are adjusted for training, and training stops when the preset number of training iterations is reached, yielding the trained linear layer.
Alternatively, Margin Ranking Loss may be used as the loss function, determined from the Euclidean distance between the label triplet embedded representation and the positive sample embedded representation and the Euclidean distance between the label triplet embedded representation and the negative sample embedded representation. For example, let L denote the Margin Ranking Loss, d1 the Euclidean distance between the label triplet embedded representation and the positive sample embedded representation, and d2 the Euclidean distance between the label triplet embedded representation and the negative sample embedded representation; then L can be expressed as L = max(0, r + d1 - d2), where r is a preset margin greater than 0.
Because the similarity between the label triplet vector and the positive sample triplet vector is high, while its difference from the negative sample triplet vector is large, once the structure coding unit (the linear layer and the activation function layer) is trained, the label triplet embedded representation and the positive sample embedded representation should be similar (a small Euclidean distance d1), while the label triplet embedded representation and the negative sample embedded representation should differ greatly (a large Euclidean distance d2); in this ideal case L should be small. At the start of training, the structure coding unit does not yet have the ability to extract features, the gap between d1 and d2 is small, and the value of L is r + d1 - d2, which is greater than 0. During training, by adjusting the parameters of the linear layer, the structure coding unit acquires the ability to extract features, the gap between d1 and d2 grows, and L decreases, gradually approaching 0. Therefore, the parameters of the linear layer can be adjusted iteratively to reduce the value of the loss function, with the training cut-off condition being the preset number of training iterations. Training stops after the parameters of the linear layer have been adjusted the preset number of times, and the resulting linear layer and activation function layer serve as the trained structure coding unit.
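To make the above training procedure concrete, the following is a minimal PyTorch sketch of the structure coding unit and one training step. The class and function names, the tensor dimensions and the choice of optimizer are illustrative assumptions; the application fixes only the linear-layer/activation structure and the margin ranking objective.

```python
import torch
import torch.nn as nn

class StructureCodingUnit(nn.Module):
    """Two linear layers, each followed by a ReLU activation function layer."""
    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),   # X' = AX + b, then ReLU
            nn.Linear(hidden_dim, out_dim), nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def train_step(unit, optimizer, label_vec, pos_vec, neg_vec, r: float = 1.0):
    """One parameter update using the margin ranking loss L = max(0, r + d1 - d2)."""
    e_label = unit(label_vec)                        # label triplet embedded representation
    e_pos, e_neg = unit(pos_vec), unit(neg_vec)      # positive / negative embedded representations
    d1 = torch.linalg.norm(e_label - e_pos, dim=-1)  # Euclidean distance to the positive sample
    d2 = torch.linalg.norm(e_label - e_neg, dim=-1)  # Euclidean distance to the negative sample
    loss = torch.clamp(r + d1 - d2, min=0.0).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Repeating train_step with an optimizer such as torch.optim.Adam until the preset number of iterations is reached yields the trained linear layers described above.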
S102: an embedded representation of the relationship to be predicted is determined based on the textual description and the knowledge-graph structural features.
After determining the knowledge-graph structural features corresponding to the text description of the relationship to be predicted, an embedded representation of the relationship to be predicted may be determined based on the text description and the knowledge-graph structural features.
In one possible implementation, feature extraction may first be performed on the text description to obtain the text features corresponding to the text description. The text features and the knowledge graph structural features are then aggregated to determine the embedded representation of the relationship to be predicted. The aggregation processing may include concatenating the text features and the knowledge graph structural features and then performing linear mapping on the concatenated features to obtain the embedded representation.
When extracting the text features of the text description, this may be accomplished, for example, by using a long short-term memory (LSTM) network. Specifically, text embedding processing may first be performed on the text description to obtain a text vector, and feature extraction is then performed on the text vector by the LSTM to acquire the text features. The LSTM can capture semantic relationships between the words of the text description, extracting more accurate and meaningful features. For the specific implementation process, reference may be made to existing techniques, which are not described further in this embodiment of the application.
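As an illustration of this step, the sketch below embeds tokenized text and extracts text features with an LSTM. The tokenization, vocabulary size and layer dimensions are assumptions made for illustration; the application does not fix a particular embedding model.

```python
import torch
import torch.nn as nn

class TextFeatureExtractor(nn.Module):
    """Text embedding processing followed by LSTM feature extraction."""
    def __init__(self, vocab_size: int, embed_dim: int = 128, hidden_dim: int = 256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)          # text -> text vector
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)  # feature extraction

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) -> text features: (batch, hidden_dim)
        text_vectors = self.embedding(token_ids)
        _, (h_n, _) = self.lstm(text_vectors)
        return h_n[-1]  # final hidden state taken as the text features
```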
In one possible implementation manner, the above process of determining the embedded representation may be implemented by the generating unit of a generative adversarial network; that is, the generating unit may acquire the text features of the text description and aggregate the text features with the knowledge graph structural features to determine the embedded representation.
A generative adversarial network may comprise a generating unit and a discriminating unit. The generating unit is used to generate samples close to real data from the input data, and the discriminating unit, typically a classifier, is used to discriminate whether an input is a real sample or a generated sample. When training the generative adversarial network, the goal of the generating unit is to generate samples that are as realistic as possible, while the goal of the discriminating unit is to accurately distinguish real samples from generated samples; the final training result is that the generating unit can generate samples realistic enough that the discriminating unit cannot tell generated samples from real ones. After the generative adversarial network is trained, the trained generating unit may be used to obtain the embedded representation of the relationship to be predicted.
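The sketch below shows one workable form of the generating unit and discriminating unit described above, assuming the aggregation is a concatenation of the text features and the knowledge graph structural features followed by a linear mapping; the layer sizes and the single-layer discriminator are illustrative assumptions, not details fixed by the application.

```python
import torch
import torch.nn as nn

class GeneratingUnit(nn.Module):
    """Aggregates text features and structural features into an embedded representation."""
    def __init__(self, text_dim: int, struct_dim: int, embed_dim: int):
        super().__init__()
        # Aggregation: concatenate, then apply a linear mapping
        self.fc = nn.Linear(text_dim + struct_dim, embed_dim)

    def forward(self, text_feat: torch.Tensor, struct_feat: torch.Tensor) -> torch.Tensor:
        return self.fc(torch.cat([text_feat, struct_feat], dim=-1))

class DiscriminatingUnit(nn.Module):
    """Classifies an embedding as a real sample or a generated sample."""
    def __init__(self, embed_dim: int):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(embed_dim, 1), nn.Sigmoid())

    def forward(self, embedding: torch.Tensor) -> torch.Tensor:
        return self.fc(embedding)  # probability that the input is a real sample
```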
S103: a first feature vector of the candidate entity pair is obtained.
In order to complete the knowledge graph, in addition to providing the relationship to be predicted, it is also necessary to provide the candidate entity pairs to be completed, i.e., a head entity and a tail entity. A candidate entity pair may be obtained by combining any two entities in the entity set.
In one possible implementation, the first feature vector of the candidate entity pair may be determined as follows: a head entity vector of the head entity and a tail entity vector of the tail entity are obtained; the head entity vector and the tail entity vector are convolved to determine the entity pair feature vector; and the entity pair feature vector is processed based on a linear mapping relation and an activation function to determine the first feature vector. The head entity vector may be obtained by performing text embedding processing on the head entity, and the tail entity vector by performing text embedding processing on the tail entity.
Optionally, when determining the entity pair feature vector, the head entity vector and the tail entity vector may first be concatenated to obtain a concatenated feature vector, and the concatenated feature vector is then convolved to determine the entity pair feature vector. The linear mapping relation may be expressed as Y' = mY + n, where Y represents the entity pair feature vector, m may represent a weight matrix, and n may represent a bias vector. The main function of the activation function is to provide non-linear processing capability. In addition, in order to extract more feature information of the candidate entities, a plurality of linear mapping relations and/or a plurality of activation functions may be used for processing, which is not limited in the embodiment of the present application.
Alternatively, the convolution layer, the linear layer and the activation function layer may be combined into an entity feature extractor for determining the first feature vector of the candidate entity pair, where the convolution layer may include a convolution kernel for the convolution processing, the linear layer includes the linear mapping relation, and the activation function layer includes the activation function. In addition, there may be a plurality of linear layers and activation function layers.
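The following sketch of such an entity feature extractor concatenates the head and tail entity vectors, convolves the result, and applies a linear mapping with a ReLU activation. The kernel size, channel count and the use of a single convolution and linear layer are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EntityFeatureExtractor(nn.Module):
    """Convolution layer, linear layer and activation function layer for candidate entity pairs."""
    def __init__(self, ent_dim: int, out_dim: int, channels: int = 4):
        super().__init__()
        self.conv = nn.Conv1d(1, channels, kernel_size=3, padding=1)  # convolution processing
        self.fc = nn.Sequential(                                      # Y' = mY + n, then ReLU
            nn.Linear(channels * 2 * ent_dim, out_dim), nn.ReLU(),
        )

    def forward(self, head_vec: torch.Tensor, tail_vec: torch.Tensor) -> torch.Tensor:
        # head_vec, tail_vec: (batch, ent_dim)
        pair = torch.cat([head_vec, tail_vec], dim=-1).unsqueeze(1)  # (batch, 1, 2*ent_dim)
        feat = self.conv(pair).flatten(start_dim=1)                  # entity pair feature vector
        return self.fc(feat)                                         # first feature vector
```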
It should be noted that the execution order of steps S101-S102 and step S103 is not limited in the embodiment of the present application; that is, steps S101-S102 may be executed before step S103, step S103 may be executed before steps S101-S102, or steps S101 and S103 may be executed simultaneously.
S104: a complement triplet is determined based on the similarity between the first feature vector and the embedded representation.
After the first feature vector of the candidate entity pair and the embedded representation of the relationship to be predicted are determined, the similarity between the first feature vector and the embedded representation may be calculated. The similarity can then be compared with a preset value, which serves as the benchmark for screening the triplets to be formed. When the similarity is greater than the preset value, the candidate entity pair is close to the relationship to be predicted, and the candidate entity pair and the relationship to be predicted can be formed into a triplet to complete the knowledge graph. When acquiring candidate entity pairs and relationships to be predicted, a plurality of candidate entity pairs and relationships to be predicted can be acquired to obtain a plurality of similarities; the target similarities among them that are greater than the preset value are obtained, and the triplets are determined from the candidate entity pairs and relationships to be predicted corresponding to the target similarities.
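As a sketch of this screening step, the function below assumes cosine similarity and a hypothetical preset value of 0.8; the application leaves the similarity measure and the preset value open.

```python
import torch
import torch.nn.functional as F

def select_completion_triplets(pair_vectors: torch.Tensor,
                               relation_embedding: torch.Tensor,
                               relation: str,
                               candidate_pairs: list,
                               preset_value: float = 0.8) -> list:
    """Keep (head, relation, tail) triplets whose similarity exceeds the preset value."""
    # pair_vectors: (n_pairs, dim); relation_embedding: (dim,)
    sims = F.cosine_similarity(pair_vectors, relation_embedding.unsqueeze(0), dim=-1)
    return [(head, relation, tail)
            for (head, tail), s in zip(candidate_pairs, sims.tolist())
            if s > preset_value]
```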
Since the purpose of completing the knowledge graph is to predict relationships that do not exist in the original knowledge graph, in order to improve the accuracy of the completed triplets, the global features between an entity and its neighbor nodes can be mined, i.e., the features of the neighbor nodes are used to enhance the extracted entity features. In one possible implementation, a second feature vector of the neighbor nodes of the candidate entity pair may be obtained, where the neighbor nodes may include the neighbor nodes of the head entity and the neighbor nodes of the tail entity. An enhanced feature vector is determined based on the first feature vector and the second feature vector; for example, the first feature vector and the second feature vector may be concatenated and then linearly mapped to form the enhanced feature vector. A triplet is then determined based on the similarity between the enhanced feature vector and the embedded representation.
Alternatively, the second feature vector of the neighbor nodes may be acquired as follows. First, the neighbor node vector of each neighbor node is acquired; the neighbor node vector may be obtained by performing text embedding processing on the neighbor node. Dot product attention calculation is then performed on the neighbor node vectors to determine the attention coefficients, and the second feature vector is determined based on the attention coefficients.
The dot product attention calculation on the neighbor node vectors may be the processing of a self-attention mechanism. Specifically, the dot product of the neighbor node vectors and their transpose can first be obtained, and the weighted vector obtained as the dot product of this weight matrix and the neighbor node vectors serves as the attention coefficient. The second feature vector is then determined based on the attention coefficient; for example, linear mapping and activation function processing may be performed on the attention coefficient to obtain the second feature vector.
Optionally, in order to avoid elements of the weight matrix being too small or even 0, the attention coefficient and the neighbor node vector can be summed to obtain a third feature vector. This avoids the weight matrix becoming too small and retains more semantic information of the neighbor nodes, making the extracted feature (the attention coefficient) more accurate. The third feature vector is then processed based on the linear mapping relation and the activation function to determine the second feature vector.
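A sketch of this computation follows: scaled dot-product self-attention over the neighbor node vectors, a residual sum giving the third feature vector, then a linear mapping and activation giving the second feature vector. The scaling factor and the final mean-pooling over neighbors are assumptions not spelled out in the application.

```python
import torch
import torch.nn as nn

class NeighborFeatureExtractor(nn.Module):
    """Dot product attention layer, linear layer and activation function layer."""
    def __init__(self, dim: int, out_dim: int):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(dim, out_dim), nn.ReLU())

    def forward(self, neighbor_vecs: torch.Tensor) -> torch.Tensor:
        # neighbor_vecs: (n_neighbors, dim)
        scores = neighbor_vecs @ neighbor_vecs.T / neighbor_vecs.size(-1) ** 0.5
        attn_coeff = torch.softmax(scores, dim=-1) @ neighbor_vecs  # attention coefficients
        third = attn_coeff + neighbor_vecs   # residual sum -> third feature vector
        second = self.fc(third)              # linear mapping + activation
        return second.mean(dim=0)            # pooled into a single second feature vector
```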
In one possible implementation, the above process of determining the second feature vector of the neighbor nodes may be implemented by combining the dot product attention layer, the linear layer and the activation function layer into a neighbor node feature extractor, where the number of linear layers and activation function layers may be one or more. In addition, the entity feature extractor and the neighbor node feature extractor in the above embodiments may be combined into a hybrid attention unit, which processes the candidate entity pair and the neighbor nodes to obtain the enhanced feature vector. Refer specifically to FIG. 2, a schematic diagram of a hybrid attention unit according to an embodiment of the present application.
With the knowledge graph completion method provided by the application, triplet completion can be performed based on the features of the relationship to be predicted and the entity pair without introducing ontology information, so the accuracy of knowledge graph completion can be improved while protecting user privacy.
Based on the method embodiment, the embodiment of the application also provides another knowledge graph completion method. Referring to FIG. 3, FIG. 3 is a flowchart of another zero-sample knowledge graph completion method according to an embodiment of the present application.
The method can be divided into two branches, where the first branch comprises steps A1-A3 and the second branch comprises steps B1-B3; the execution processes of the two branches do not affect each other.
Wherein, the steps A1-A3 can comprise:
A1: acquiring a candidate entity pair and neighbor nodes of the candidate entity pair;
A2: Determining a first feature vector of the candidate entity pair and a second feature vector of the neighbor nodes;
A3: Determining an enhanced feature vector based on the first feature vector and the second feature vector.
Steps B1-B3 may include:
B1: acquiring text description of a relation to be predicted;
B2: Determining the knowledge graph structural features of the text description;
B3: Determining an embedded representation of the relationship to be predicted based on the text description and the knowledge graph structural features.
After the two branches are performed, step C1 is finally performed.
C1: based on the similarity between the enhanced feature vector and the embedded representation, a triplet is determined.
Based on the method embodiment, the embodiment of the application also provides a zero-sample knowledge graph completion device. Referring to FIG. 4, FIG. 4 is a schematic diagram of a zero-sample knowledge graph completion device according to an embodiment of the present application.
The apparatus 400 may include:
A first determining unit 401, configured to determine a knowledge-graph structural feature based on a text description of a relationship to be predicted;
a second determining unit 402, configured to determine an embedded representation of the relationship to be predicted based on the text description and the knowledge-graph structural feature;
an obtaining unit 403, configured to obtain a first feature vector of the candidate entity pair;
A third determining unit 404, configured to determine a completion triplet based on the similarity between the first feature vector and the embedded representation, the triplet being composed of the relationship to be predicted and the candidate entity pair.
In a possible implementation manner, the first determining unit 401 is specifically configured to obtain the knowledge graph structural features based on a structure coding unit; the structure coding unit is trained based on the label triplet vector, the positive sample triplet vector and the negative sample triplet vector.
In one possible implementation, the structure coding unit includes a linear layer and an activation function layer, and the training process of the structure coding unit includes: acquiring the label triplet vector, the positive sample triplet vector and the negative sample triplet vector; inputting the label triplet vector, the positive sample triplet vector and the negative sample triplet vector into the linear layer and the activation function layer to obtain a label triplet embedded representation, a positive sample embedded representation and a negative sample embedded representation; determining a loss function based on the Euclidean distance between the label triplet embedded representation and the positive sample embedded representation and the Euclidean distance between the label triplet embedded representation and the negative sample embedded representation; and adjusting the parameters of the linear layer for training, with reduction of the loss function value as the objective, until a preset number of training iterations is reached, to obtain the trained linear layer.
In a possible implementation manner, the candidate entity pair includes a head entity and a tail entity, and the obtaining unit 403 is specifically configured to obtain a head entity vector of the head entity and a tail entity vector of the tail entity; perform convolution processing on the head entity vector and the tail entity vector to determine an entity pair feature vector; and process the entity pair feature vector based on a linear mapping relation and an activation function to determine the first feature vector.
In a possible implementation manner, the third determining unit 404 is specifically configured to obtain a second feature vector of a neighbor node of the candidate entity pair; determine an enhanced feature vector based on the first feature vector and the second feature vector; and determine the triplet based on the similarity between the enhanced feature vector and the embedded representation.
In a possible implementation manner, the third determining unit 404 is specifically configured to obtain a neighbor node vector of the neighbor node; perform dot product attention calculation on the neighbor node vector to determine an attention coefficient; and determine the second feature vector based on the attention coefficient.
In a possible implementation manner, the third determining unit 404 is specifically configured to determine a third feature vector based on the attention coefficient and the neighbor node vector; and process the third feature vector based on the linear mapping relation and the activation function to determine the second feature vector.
In a possible implementation manner, the second determining unit 402 is specifically configured to obtain the text features of the text description; and perform aggregation processing on the text features and the knowledge graph structural features to determine the embedded representation.
In a possible implementation manner, the second determining unit 402 is specifically configured to perform text embedding processing on the text description to obtain a text vector; and perform feature extraction on the text vector based on a long short-term memory network (LSTM) to acquire the text features.
Based on the method embodiment and the device embodiment, the embodiment of the application also provides a zero-sample knowledge graph completion apparatus, described below with reference to the accompanying drawings.
Referring to FIG. 5, FIG. 5 is a schematic diagram of a zero-sample knowledge graph completion apparatus according to an embodiment of the present application.
The apparatus 500 includes: a memory 501 and a processor 502;
The memory 501 is used for storing relevant program codes;
the processor 502 is configured to invoke the program code and execute the zero sample knowledge graph completion method described in the above method embodiment.
In addition, the embodiment of the application also provides a computer readable storage medium for storing a computer program, wherein the computer program is used for executing the zero sample knowledge graph completion method described in the embodiment of the method.
It should be noted that the technical features of the above apparatus provided in the embodiments of the present application are clear to those skilled in the art, as is the problem solved by the apparatus; how to obtain the corresponding features may be chosen by those skilled in the art according to specific implementation requirements, and the apparatus provided by the application should not be regarded as limiting the scheme or as the only means of implementation.
It should be noted that, in the present description, each embodiment is described in a progressive manner, each embodiment focuses on its differences from the other embodiments, and for identical and similar parts between the embodiments, reference may be made to each other. In particular, for system or apparatus embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and for relevant portions reference may be made to the description of the method embodiments. The above-described apparatus embodiments are merely illustrative; units or modules illustrated as separate components may or may not be physically separate, and components shown as units or modules may or may not be physical modules, i.e., they may be located in one place or distributed over multiple network units, and some or all of the units or modules may be selected according to actual needs to achieve the purposes of the embodiment. Those of ordinary skill in the art can understand and implement this without undue burden.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods, apparatus and devices, etc., according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more. "and/or" for describing the association relationship of the association object, the representation may have three relationships, for example, "a and/or B" may represent: only a, only B and both a and B are present, wherein a, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
It is further noted that, in the present application, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A zero sample knowledge graph completion method, the method comprising:
Determining structural features of the knowledge graph based on the text description of the relation to be predicted;
Determining an embedded representation of the relationship to be predicted based on the textual description and the knowledge-graph structural features;
Acquiring a first feature vector of a candidate entity pair;
determining a completion triplet based on the similarity between the first feature vector and the embedded representation, the triplet being composed of the relationship to be predicted and the candidate entity pair.
2. The method of claim 1, wherein the determining knowledge-graph structural features based on the textual description of the relationship to be predicted comprises:
Acquiring the knowledge graph structural features based on a structure coding unit; the structure coding unit is trained based on a label triplet vector, a positive sample triplet vector and a negative sample triplet vector.
3. The method of claim 2, wherein the structural coding unit comprises a linear layer and an activation function layer, and wherein the training process of the structural coding unit comprises:
Acquiring the label triplet vector, the positive sample triplet vector and the negative sample triplet vector;
inputting the label triplet vector, the positive sample triplet vector and the negative sample triplet vector into the linear layer and the activation function layer to obtain a label triplet embedded representation, a positive sample embedded representation and a negative sample embedded representation;
Determining a loss function based on the Euclidean distance between the label triplet embedded representation and the positive sample embedded representation and the Euclidean distance between the label triplet embedded representation and the negative sample embedded representation;
and adjusting the parameters of the linear layer for training, with reduction of the loss function value as the objective, until a preset number of training iterations is reached, to obtain the trained linear layer.
4. The method of claim 1, wherein the candidate entity pair comprises a head entity and a tail entity, and wherein the obtaining the first feature vector of the candidate entity pair comprises:
acquiring a head entity vector of the head entity and a tail entity vector of the tail entity;
Performing convolution processing on the head entity vector and the tail entity vector to determine an entity pair feature vector;
and processing the entity pair feature vector based on a linear mapping relation and an activation function to determine the first feature vector.
5. The method of claim 1, wherein the determining a completed triplet based on a similarity between the first feature vector and the embedded representation comprises:
acquiring a second feature vector of a neighbor node of the candidate entity pair;
Determining an enhanced feature vector based on the first feature vector and the second feature vector;
determining the triplet based on the similarity between the enhanced feature vector and the embedded representation.
6. The method of claim 5, wherein the obtaining the second feature vector of the neighbor node of the candidate entity pair comprises:
acquiring neighbor node vectors of the neighbor nodes;
performing dot product attention calculation on the neighbor node vectors, and determining an attention coefficient;
determining the second feature vector based on the attention coefficient.
7. The method of claim 6, wherein the determining the second feature vector based on the attention coefficient comprises:
determining a third feature vector based on the attention coefficient and the neighbor node vector;
And processing the third feature vector based on the linear mapping relation and the activation function to determine the second feature vector.
8. The method of claim 1, wherein the determining the embedded representation of the relationship to be predicted based on the textual description and the knowledge-graph structural feature comprises:
acquiring text characteristics of the text description;
And carrying out aggregation processing on the text features and the knowledge graph structural features, and determining the embedded representation.
9. The method of claim 8, wherein the obtaining text features of the text description comprises:
Performing text embedding processing on the text description to obtain a text vector;
and performing feature extraction on the text vector based on a long short-term memory network (LSTM) to acquire the text features.
10. A zero-sample knowledge-graph completion device, the device comprising:
the first determining unit is used for determining the structural characteristics of the knowledge graph based on the text description of the relation to be predicted;
A second determining unit, configured to determine an embedded representation of the relationship to be predicted based on the text description and the knowledge-graph structural feature;
An obtaining unit, configured to obtain a first feature vector of a candidate entity pair;
And a third determining unit, configured to determine a completion triplet based on the similarity between the first feature vector and the embedded representation, where the triplet is composed of the relationship to be predicted and the candidate entity pair.
Priority Applications (1)

Application number: CN202410134194.8A
Priority date: 2024-01-31
Filing date: 2024-01-31
Title: Zero sample knowledge graph completion method and device

Publications (1)

Publication number: CN117951308A
Publication date: 2024-04-30

Family ID: 90804640


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination