CN114817568B - Knowledge hypergraph link prediction method combining attention mechanism and convolutional neural network

Info

Publication number
CN114817568B
CN114817568B (application CN202210475730.1A)
Authority
CN
China
Prior art keywords
vector
entity
tuple
initial
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210475730.1A
Other languages
Chinese (zh)
Other versions
CN114817568A (en)
Inventor
庞俊
徐浩
任亮
林晓丽
张鸿
徐新
张晓龙
李波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Science and Engineering WUSE
Original Assignee
Wuhan University of Science and Engineering WUSE
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Science and Engineering WUSE filed Critical Wuhan University of Science and Engineering WUSE
Priority to CN202210475730.1A priority Critical patent/CN114817568B/en
Publication of CN114817568A publication Critical patent/CN114817568A/en
Application granted granted Critical
Publication of CN114817568B publication Critical patent/CN114817568B/en
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network, which comprises the following steps: S1, loading a knowledge hypergraph to be complemented to obtain entities and relations; S2, initializing the loaded entities and relations to obtain initial entity embedding vectors and initial relation embedding vectors; S3, inputting the initial entity embedding vector and the initial relation embedding vector into an ACLP model in tuple form for training; S4, processing the initial relation embedding vector to obtain a processed relation attention vector; S5, processing the initial entity embedding vector to obtain a processed entity projection embedding vector; and S6, scoring the processed tuples through a preset scoring module to obtain a prediction result, judging whether the scoring result of each tuple is correct, adding the correct tuples into the knowledge hypergraph, and complementing the knowledge hypergraph. The invention ensures that the processed tuples contain more information and improves link prediction accuracy.

Description

Knowledge hypergraph link prediction method combining attention mechanism and convolutional neural network
Technical Field
The invention relates to the technical field of knowledge hypergraphs, in particular to a knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network.
Background
The knowledge hypergraph is a knowledge graph with a hypergraph structure: by introducing hyperedge relations, it can represent relationships among multiple real-world entities, and it is a generalization of the knowledge graph. A knowledge hypergraph is composed of entities and hyper-relations, has the node and hyperedge characteristics of a hypergraph, and can be used to record things and relationships in the real world. However, because the real world is intricate and its facts are difficult to store exhaustively, existing knowledge hypergraphs are generally incomplete. To make an incomplete knowledge hypergraph as complete as possible, it is necessary to complement it. Link prediction aims to complement the knowledge hypergraph by predicting unknown tuples from the existing relations and entities, and can therefore alleviate the incompleteness problem of the knowledge hypergraph.
Existing knowledge hypergraph link prediction largely uses methods based on embedded representation models. Their advantage is that a complex data structure can be mapped into Euclidean space and converted into a vectorized representation, so that association relationships are found more easily and reasoning is completed. For different tasks, the vectorized representation obtained by an embedding-based method can be passed to a neural network, which deeply learns the structural and semantic features in the knowledge hypergraph and thereby effectively predicts the missing relations and entities. However, conventional embedding-based methods focus only on entities and neglect the n-ary relations: a relation receives only initial embedding processing and therefore carries little information, which restricts algorithm performance.
Disclosure of Invention
The invention provides a knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network in order to overcome the defects of the technology.
Term interpretation:
1. ACLP: attention and Convolution Network Link Prediction, attention is linked to convolutional network predictions.
2. ResidualNet: and (5) a residual network.
3. MLP: multilayerPerceptron, a multilayer perceptron.
4. MRR: mean Reciprocal Rank, representing the average reciprocal rank.
The invention adopts an improved attention mechanism module to enrich the information of the n-ary relation embedding vector: the information in the entities is added into the relation embedding vector according to weight proportions, so that the resulting relation attention vector contains more information. In addition, adjacent-entity information is added to the convolution kernel used for extracting entity features, so that the resulting extraction vector contains information on the number of adjacent entities in the same tuple. To prevent excessive loss of initial entity information during training, the entity projection vector is summed with the corresponding initial entity embedding vector before participating in link prediction scoring. Finally, a residual network and a multi-layer perceptron are used for optimization, further improving link prediction accuracy.
The technical scheme adopted for overcoming the technical problems is as follows:
a knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network is used for carrying out reasoning prediction on unknown tuples in the knowledge hypergraph, and at least comprises the following steps:
S1, loading a knowledge hypergraph to be complemented to obtain entities and relations in the knowledge hypergraph;
S2, initializing the entity and the relation obtained by loading in the step S1 to obtain an initial entity embedding vector and an initial relation embedding vector;
S3, inputting the initial entity embedding vector and the initial relation embedding vector obtained in step S2 into an ACLP model in tuple form for training, wherein the ACLP model at least comprises an attention mechanism module and a convolutional neural network module;
S4, processing the initial relation embedded vector obtained in the step S2 through the attention mechanism module in the step S3, and adding the information of the entity in the tuple into the relation embedded vector in proportion to the importance degree of the entity in the tuple to obtain a processed relation attention vector;
S5, extracting features of the initial entity embedding vector obtained in step S2 through the convolutional neural network module in step S3, and adding information on the number of adjacent entities in the tuple to the convolution kernel in the convolutional neural network module to obtain a processed entity projection embedding vector;
S6, scoring the processed tuple through a preset scoring module to obtain a prediction result, and judging whether the scoring result of the tuple is correct or not according to an evaluation index: if the result is correct, adding the correct tuple into the knowledge hypergraph, complementing the knowledge hypergraph, and if the result is incorrect, discarding the incorrect tuple;
Wherein the processed tuple comprises the processed relationship vector and the processed entity vector.
Further, the knowledge hypergraph is a graph composed of vertices and hyperedges, denoted:

KHG = {V, E}

In the above formula, $V = \{v_1, v_2, \ldots, v_{|V|}\}$ denotes the set of entities in KHG, and $|V|$ denotes the number of entities contained in KHG; $E = \{e_1, e_2, \ldots, e_{|E|}\}$ denotes the set of relationships between entities, i.e. the set of hyperedges, and $|E|$ denotes the number of hyperedges contained in KHG. Any hyperedge $e$ corresponds to a tuple $T = e(v_1, v_2, \ldots, v_{|e|})$, $T \in \tau$, where $|e|$ denotes the number of entities contained in the hyperedge $e$, i.e. the arity of $e$, and $\tau$ denotes the set of all tuples of the ideal, complete target knowledge hypergraph KHG.
Further, the step S4 specifically includes:
The input to the attention mechanism module is the initial relation embedding vector $x_{e_i} \in \mathbb{R}^{d_e}$ of the relation $e_i$ in the tuple and the corresponding set of initial entity embedding vectors $X_{e_i} = [x_{v_1}, x_{v_2}, \ldots, x_{v_{|e_i|}}] \in \mathbb{R}^{|e_i| \times d_v}$, where $d_e$ denotes the dimension of a relation when initialized to a vector and can be predefined, $1 \le i \le |E|$, $X_{e_i}$ is the matrix of all entity vectors in the relation $e_i$, $|e_i|$ denotes the number of entities contained in the relation $e_i$, and $d_v$ denotes the dimension of an entity $v$ when initialized to a vector.

First, the initial relation embedding vector $x_{e_i}$ and the initial entity embedding vector set $X_{e_i}$ are concatenated, the concatenated vectors are linearly mapped, and the result is processed by the LeakyReLU nonlinear function, which yields the projection vector $P$ containing the information of both the initial entity embedding vector set and the initial relation embedding vector. The calculation is shown in formula (1):

$$P = \mathrm{LeakyReLU}\big([\,x_{e_i} \,\|\, X_{e_i}\,]\,W_p\big) \tag{1}$$

In the above formula, $d_p$ denotes the dimension of the projection vector $P$, $W_p \in \mathbb{R}^{(d_e + d_v) \times d_p}$ denotes the mapping matrix, and $\|$ denotes the concatenation operation.

The projection vector $P$ is processed by softmax to obtain the weight vector $\alpha$ between the initial relation embedding vector $x_{e_i}$ and the initial entity embedding vector set $X_{e_i}$. The softmax calculation is shown in formula (2):

$$\alpha_j = \mathrm{softmax}(P)_j = \frac{e^{P_j}}{\sum_{k=1}^{|e_i|} e^{P_k}} \tag{2}$$

In the above formula, softmax denotes the normalized exponential function, $e^{P_j}$ denotes $e$ raised to the power $P_j$, and $P_j$ denotes the $j$-th row of $P$.

Accumulating the products of $\alpha_j$ and $x_{v_j}$ yields the relation attention vector $a_{e_i}$. The calculation is shown in formula (3):

$$a_{e_i} = \sum_{j=1}^{|e_i|} \alpha_j\, x_{v_j} \tag{3}$$
further, the step S5 specifically includes:
First, the convolutional neural network module takes the initial entity embedding vector $x_{v_i} \in \mathbb{R}^{d_v}$ as input, and a convolution kernel $\omega_i$ containing tuple position information extracts the features of $x_{v_i}$. The parameter $neb_i$ then adds the number of adjacent entities to the kernel $\omega_i$, so that the features extracted by $\omega_i$ vary with the number of adjacent entities. The convolution embedding vector $m_i^j$ obtained after the convolution is computed as shown in formula (4):

$$m_i^j = x_{v_i} * \big(\omega_i^j + neb_i\big) \tag{4}$$

In the above formula, $\omega_i^j$ denotes row $j$ of the convolution kernel at the $i$-th position in the tuple, $\omega_i^j \in \mathbb{R}^{l}$, $*$ denotes the one-dimensional convolution operation, and $l$ denotes the convolution kernel length.

To obtain a complete mapping vector $\hat{m}_i$, the obtained convolution embedding vectors $m_i^j$ are concatenated and linearly mapped, as shown in formula (5):

$$\hat{m}_i = \mathrm{concat}\big(m_i^1, m_i^2, \ldots, m_i^n\big)\, W_m \tag{5}$$

In the above formula, $W_m \in \mathbb{R}^{nq \times d_v}$ denotes the linear mapping matrix and $q$ denotes the size of the feature map, $q = (d_v - l)/s + 1$, with $s$ the convolution stride. After the multiple vectors are concatenated into a single vector, the dimension increases, so the matrix $W_m$ maps the $nq$-dimensional vector into a $d_v$-dimensional vector.

Adding the initial entity embedding vector $x_{v_i}$ to the mapping vector $\hat{m}_i$ yields the entity projection embedding vector $p_{v_i}$, as shown in formula (6):

$$p_{v_i} = x_{v_i} + \hat{m}_i \tag{6}$$
further, the ACLP model also includes an optimization module that includes at least a residual network.
Further, before step S6, the processing, by using the residual network, of the entity projection embedded vector obtained after the processing by the convolutional neural network module specifically includes the following steps:
The residual function $F(x)$ of the residual network adopts a convolutional neural network, and the whole residual network is computed as shown in formula (7):

$$r_{v_i} = \delta\big(F(p_{v_i}) + p_{v_i}\big) \tag{7}$$

In the above formula, $r_{v_i}$ denotes the entity residual vector, $\delta$ denotes the ReLU activation function, $\omega_i \in \mathbb{R}^{n \times l}$ denotes the convolution kernel at the $i$-th position in the tuple, $n$ denotes the number of convolution kernels at that position, $l$ denotes the convolution kernel length ($l$ and $n$ can be predefined), and $F(x)$ maps the result to a vector of the same dimension as $p_{v_i}$.
Further, the optimization module also comprises a multi-layer perceptron.
Further, before step S6, the entity residual vector $r_{v_i}$ is processed by the multi-layer perceptron, which specifically comprises the following steps:

The multi-layer perceptron takes the entity residual vector $r_{v_i}$ obtained from formula (7) as the input-layer vector; the input-layer vector is connected to the output through weights, and the information propagation process of the multi-layer perceptron is expressed mathematically as shown in formula (8):

$$h^{(x)} = \delta_x\big(W^{(x)}\, h^{(x-1)} + b^{(x)}\big) \tag{8}$$

In the above formula, $h^{(x)}$ denotes the entity perception vector finally output by the neurons of layer $x$ (with $h^{(0)} = r_{v_i}$), $W^{(x)} \in \mathbb{R}^{d_x \times d_{x-1}}$ denotes the transformation parameters from layer $x-1$ to layer $x$, $d_x$ denotes the dimension of layer $x$, $b^{(x)}$ denotes the bias parameters of layer $x$, and $\delta_x$ denotes the activation function of layer $x$.
Further, in step S6, the scoring of the processed tuple by the preset scoring module specifically includes the following steps:
After the initial relation embedding vector is processed by step S4 and the initial entity embedding vector is processed by step S5 and optimized by the optimization module, the entity perception vector $h_{v_i}$ is obtained. The inner product between the relation attention vector $a_{e}$ and the perception vectors $h_{v_1}, \ldots, h_{v_{|e|}}$ of all entities within the tuple scores the tuple $T$, as shown in formula (9):

$$\phi(T) = \big\langle a_{e},\, h_{v_1}, \ldots, h_{v_{|e|}} \big\rangle = \sum_{k=1}^{d_v} a_{e}[k] \prod_{j=1}^{|e|} h_{v_j}[k] \tag{9}$$
Further, in step S6, it is determined whether the processed tuple is correct, which specifically includes the following steps:
During prediction, replacing an entity $v_i$ of the tuple $T$ with arbitrary other entities creates a negative tuple set $G_{neg}(T)$, whose elements are denoted $T'$, $T' \in G_{neg}(T)$. Each tuple $T'$ is scored with formula (9), and the tuples in $G_{neg}(T)$ are sorted by score to obtain the rank of the tuple $T$ within $G_{neg}(T)$. Depending on how the rank is used, either of the evaluation methods MRR or Hit@n is adopted.

MRR denotes the mean reciprocal rank: the reciprocal of the rank of each evaluated tuple within its $G_{neg}(T)$ is computed and averaged. The MRR calculation is shown in formula (10):

$$\mathrm{MRR} = \frac{1}{N} \sum_{T} \frac{1}{\mathrm{rank}_T} \tag{10}$$

In the above formula, the summation traverses the reciprocals of the tuple ranks in $G_{neg}(T)$, $N$ denotes the number of evaluated tuples, and the larger the MRR value, the better the effect.

Hit@n denotes a class of evaluation methods whose calculation is shown in formula (11):

$$\mathrm{Hit@}n = \frac{num}{N} \tag{11}$$

If the rank is not greater than $n$, the tuple is considered a positive tuple; $n$ takes the value 1, 3, or 10, and $num$ denotes the number of positive tuples. The larger Hit@n is, the better the effect.
The beneficial effects of the invention are as follows:
Compared with traditional knowledge hypergraph link prediction methods, the invention mainly uses the attention mechanism module to add the entity information in the tuple to the relation embedding vector when processing the n-ary relations in the knowledge hypergraph, so that the processed relation attention vector contains more information. Information on the number of adjacent entities in the tuple is added to the convolutional neural network module, so that it can extract more information from the tuple when extracting entity features. Further, the ACLP model is optimized: the vector output by the convolutional neural network module is processed with a residual network, which alleviates the gradient vanishing problem, allows the model to keep learning, and reduces the training loss. In addition, to enhance the nonlinear learning capability of the model, a multi-layer perceptron is added after the residual network, so that the model can learn more features, improving the accuracy of the model's knowledge hypergraph link prediction.
Drawings
FIG. 1 is a flow chart of a method of knowledge hypergraph link prediction for a joint attention mechanism and convolutional neural network in accordance with the present invention.
FIG. 2 is a flow chart of relationship and entity information in a rich knowledge hypergraph of the present invention.
FIG. 3 is a block diagram of the modules used in the knowledge hypergraph link prediction method of the joint attention mechanism and convolutional neural network of the present invention.
Fig. 4 is a schematic diagram of the ACLP model of the present invention.
Fig. 5 is a schematic diagram of the calculation process of the relationship attention vector of the present invention.
Fig. 6 is a schematic diagram of the calculation process of the entity projection embedded vector of the present invention.
Detailed Description
The invention will now be described in further detail with reference to the drawings and the specific examples, which are given by way of illustration only and are not intended to limit the scope of the invention, in order to facilitate a better understanding of the invention to those skilled in the art.
As shown in FIG. 1, the invention mainly combines an attention mechanism and a convolutional neural network, so that the relation embedding vector contains more information and the convolutional neural network extracts more entity embedding features, thereby realizing high-precision reasoning over the relations and entities in the knowledge hypergraph. The invention enriches the information contained in the relations and entities of the knowledge hypergraph. As shown in FIG. 2, the invention acquires entity information and relation information from the knowledge hypergraph to be complemented, then uses the convolutional neural network module to extract the features of the entities and the attention mechanism module to learn the information of the entities near the relation in the same tuple into the relation vector, so that both the entity vectors and the relation vectors contain more information, and the subsequent scoring module can judge more effectively, from the enriched information, whether an input tuple is correct or wrong. If a tuple is judged to be wrong, it is discarded; if it is judged to be correct, it is added into the knowledge hypergraph, complementing the knowledge hypergraph. In addition, after the relation embedding vector and the entity embedding vector are processed, they are optimized by the optimization module: a residual network alleviates the gradient vanishing problem of the method, and a multi-layer perceptron enhances its nonlinear learning capability, further improving its link prediction accuracy on the knowledge hypergraph.
FIG. 3 depicts a structural diagram of the modules used in the knowledge hypergraph link prediction method combining the attention mechanism and the convolutional neural network. Most importantly, the convolutional neural network module and the attention mechanism module respectively process the loaded data to obtain the entity projection embedding vector and the relation attention vector containing richer information. The optimization module comprises a residual network and a multi-layer perceptron, and further enhances the knowledge hypergraph link prediction effect of the invention, making the prediction results more accurate. The scoring module scores the processed relation attention vector and the processed entity projection embedding vector and judges whether the tuple to which the entities and relation belong is correct: if the tuple is judged to be wrong, it is discarded; if it is judged to be correct, it is added into the knowledge hypergraph, complementing the knowledge hypergraph.
The schematic diagram of the ACLP model of the invention is shown in FIG. 4. The ACLP model mainly comprises three steps: (1) generating the relation attention vector, whose calculation process is shown in FIG. 5; (2) generating the entity projection embedding vector, whose calculation process is shown in FIG. 6; and (3) scoring the tuple. In FIGS. 5 and 6, concat denotes the concatenation operation and project denotes linear mapping.
Before the knowledge hypergraph link prediction method combining the attention mechanism and the convolutional neural network is specifically introduced, the definition of a knowledge hypergraph is given first. A knowledge hypergraph is a graph composed of vertices and hyperedges, denoted:

KHG = {V, E}

In the above formula, $V = \{v_1, v_2, \ldots, v_{|V|}\}$ denotes the set of entities in KHG, and $|V|$ denotes the number of entities contained in KHG; $E = \{e_1, e_2, \ldots, e_{|E|}\}$ denotes the set of relationships between entities, i.e. the set of hyperedges, and $|E|$ denotes the number of hyperedges contained in KHG. Any hyperedge $e$ corresponds to a tuple $T = e(v_1, v_2, \ldots, v_{|e|})$, $T \in \tau$, where $|e|$ denotes the number of entities contained in the hyperedge $e$, i.e. the arity of $e$, and $\tau$ denotes the set of all tuples of the ideal, complete target knowledge hypergraph KHG.
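This tuple structure maps naturally onto a small container type. The following is a minimal illustrative sketch, not part of the patented method; the class and method names and the example fact are assumptions:

```python
# Minimal sketch of KHG = {V, E}; names (KnowledgeHypergraph, add_tuple)
# and the example fact are illustrative assumptions.
from collections import defaultdict

class KnowledgeHypergraph:
    def __init__(self):
        self.entities = set()               # V: the set of entities
        self.relations = defaultdict(list)  # E: relation name -> list of tuples

    def add_tuple(self, relation, *entities):
        """Store one tuple T = e(v_1, ..., v_|e|)."""
        self.entities.update(entities)
        self.relations[relation].append(tuple(entities))

khg = KnowledgeHypergraph()
khg.add_tuple("degree_from", "Alice", "WUSE", "Computer Science")  # a ternary hyperedge
```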
The knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network, which is described in this embodiment, is used for performing inference prediction on unknown tuples in a knowledge hypergraph, as shown in fig. 1 to 6, and includes the following steps:
In step S1, the knowledge hypergraph to be complemented is loaded to obtain the entities and relations in it. Specifically, the knowledge hypergraph used in this embodiment is stored in text form and is loaded into the ACLP model in tuple form through a data loading function for processing. Tuples may share hyperedges or entities; through these shared hyperedges and entities, tuples link with one another to form the whole knowledge hypergraph, which contains rich semantic information and can therefore reflect facts of the real world.
And S2, initializing the entity and the relation obtained by loading in the step S1 to obtain an initial entity embedding vector and an initial relation embedding vector.
After the knowledge hypergraph is loaded, the entities and relations in it need to be initialized and converted into embedding vectors. The initialization is similar to word embedding: a word matrix is built according to the number of words and a defined dimension, and multiplied with a randomly initialized embedding matrix to obtain word embedding vectors. Analogously, the invention initializes an entity matrix and a relation matrix from the entity information and relation information, and multiplies them with randomly initialized matrices to obtain the initial entity embedding vectors and initial relation embedding vectors. Embedding the entities and relations of the knowledge hypergraph into a continuous vector space retains the structural information of the knowledge hypergraph while being convenient for computation, and converting the complex data structure into a vectorized representation facilitates the subsequent work. During knowledge hypergraph reasoning, the embedded representations of entities and relations map the association information implicit in the graph structure into Euclidean space, making association relationships that were originally hard to find apparent, so link prediction that reasons over the entity and relation embedding vectors can better predict the missing entities and relations.
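As a concrete illustration, the initialization of step S2 can be sketched with standard embedding lookup tables; the dimensions, initializer, and counts below are assumed hyperparameters, not values fixed by the patent:

```python
# Hedged sketch of step S2: random initialization of entity and relation
# embeddings, analogous to a word-embedding lookup table.
import torch
import torch.nn as nn

num_entities, num_relations = 1000, 50
d_v, d_e = 200, 200  # assumed embedding dimensions

entity_emb = nn.Embedding(num_entities, d_v)     # initial entity embedding vectors
relation_emb = nn.Embedding(num_relations, d_e)  # initial relation embedding vectors
nn.init.xavier_uniform_(entity_emb.weight)
nn.init.xavier_uniform_(relation_emb.weight)

# Looking up one tuple (relation id r, entity ids vs) yields the vectors fed to ACLP:
r, vs = torch.tensor([3]), torch.tensor([10, 42, 7])
x_e, X_v = relation_emb(r).squeeze(0), entity_emb(vs)
```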
In step S3, the initial entity embedding vector and the initial relation embedding vector obtained in step S2 are input into an ACLP model in tuple form for training, wherein the ACLP model comprises an attention mechanism module, a convolutional neural network module, and an optimization module.
Specifically, the attention mechanism module is configured to process the initial relation embedded vector obtained in the step S2, the convolutional neural network module is configured to process the initial entity embedded vector obtained in the step S2, the optimization module includes a residual network and a multi-layer perceptron, the residual network is configured to process the entity projection embedded vector obtained after being processed by the convolutional neural network module, and the multi-layer perceptron is configured to process the entity residual vector.
The whole ACLP model applies different processing to the initial entity embedding vectors and the initial relation embedding vectors of the tuples in the knowledge hypergraph, with the common purpose of making both finally contain more information that benefits link prediction. The implementation of the three modules is specifically described below.
In step S4, the initial relation embedding vector obtained in step S2 is processed by the attention mechanism module of step S3, and the information of the entities in the tuple is added into the relation embedding vector in proportion to their importance within the tuple, yielding the processed relation attention vector.

In this embodiment, when the attention mechanism module processes the initial relation embedding vector obtained in step S2, its input is the initial relation embedding vector $x_{e_i} \in \mathbb{R}^{d_e}$ of the relation $e_i$ in the tuple and the corresponding set of initial entity embedding vectors $X_{e_i} = [x_{v_1}, x_{v_2}, \ldots, x_{v_{|e_i|}}] \in \mathbb{R}^{|e_i| \times d_v}$, where $d_e$ denotes the dimension of a relation when initialized to a vector and can be predefined, $1 \le i \le |E|$, $X_{e_i}$ is the matrix of all entity vectors in the relation $e_i$, $|e_i|$ denotes the number of entities contained in the relation $e_i$, and $d_v$ denotes the dimension of an entity $v$ when initialized to a vector.

First, the initial relation embedding vector $x_{e_i}$ and the initial entity embedding vector set $X_{e_i}$ are concatenated, the concatenated vectors are linearly mapped, and the result is processed by the LeakyReLU nonlinear function, which yields the projection vector $P$ containing the information of both the initial entity embedding vector set and the initial relation embedding vector. The calculation is shown in formula (1):

$$P = \mathrm{LeakyReLU}\big([\,x_{e_i} \,\|\, X_{e_i}\,]\,W_p\big) \tag{1}$$

In the above formula, $d_p$ denotes the dimension of the projection vector $P$, $W_p \in \mathbb{R}^{(d_e + d_v) \times d_p}$ denotes the mapping matrix, and $\|$ denotes the concatenation operation.

The projection vector $P$ is processed by softmax to obtain the weight vector $\alpha$ between the initial relation embedding vector $x_{e_i}$ and the initial entity embedding vector set $X_{e_i}$. The softmax calculation is shown in formula (2):

$$\alpha_j = \mathrm{softmax}(P)_j = \frac{e^{P_j}}{\sum_{k=1}^{|e_i|} e^{P_k}} \tag{2}$$

In the above formula, softmax denotes the normalized exponential function, $e^{P_j}$ denotes $e$ raised to the power $P_j$, and $P_j$ denotes the $j$-th row of $P$.

Accumulating the products of $\alpha_j$ and $x_{v_j}$ yields the relation attention vector $a_{e_i}$. The calculation is shown in formula (3):

$$a_{e_i} = \sum_{j=1}^{|e_i|} \alpha_j\, x_{v_j} \tag{3}$$
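A minimal sketch of this attention module, assuming the reconstruction of formulas (1) to (3) above; the layer and variable names are illustrative, not the patent's verbatim implementation:

```python
# Sketch of the attention mechanism module (formulas (1)-(3)).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationAttention(nn.Module):
    def __init__(self, d_e, d_v, d_p=1):
        super().__init__()
        self.W_p = nn.Linear(d_e + d_v, d_p, bias=False)  # mapping matrix W_p

    def forward(self, x_e, X_v):
        # x_e: (d_e,) initial relation embedding; X_v: (|e_i|, d_v) entity embeddings
        cat = torch.cat([x_e.expand(X_v.size(0), -1), X_v], dim=-1)  # concatenation
        P = F.leaky_relu(self.W_p(cat))   # formula (1): projection vector
        alpha = torch.softmax(P, dim=0)   # formula (2): entity weights
        return (alpha * X_v).sum(dim=0)   # formula (3): relation attention vector

attn = RelationAttention(d_e=200, d_v=200)
a_e = attn(torch.randn(200), torch.randn(3, 200))
```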
In step S5, features of the initial entity embedding vector obtained in step S2 are extracted by the convolutional neural network module of step S3, and information on the number of adjacent entities in the tuple is added to the convolution kernel in the module, yielding the processed entity projection embedding vector.

An entity of the knowledge hypergraph can appear simultaneously at different positions in multiple tuples, and the number and characteristics of its adjacent entities in the same tuple differ according to the position it occupies. To extract features according to the position of the entity $v_i$ in the tuple and obtain the convolution embedding vector, in this embodiment the convolutional neural network module first takes the initial entity embedding vector $x_{v_i} \in \mathbb{R}^{d_v}$ as input, and a convolution kernel $\omega_i$ containing tuple position information extracts the features of $x_{v_i}$. The parameter $neb_i$ then adds the number of adjacent entities to the kernel $\omega_i$, so that the features extracted by $\omega_i$ vary with the number of adjacent entities. The convolution embedding vector $m_i^j$ obtained after the convolution is computed as shown in formula (4):

$$m_i^j = x_{v_i} * \big(\omega_i^j + neb_i\big) \tag{4}$$

In the above formula, $\omega_i^j$ denotes row $j$ of the convolution kernel at the $i$-th position in the tuple, $\omega_i^j \in \mathbb{R}^{l}$, $*$ denotes the one-dimensional convolution operation, and $l$ denotes the convolution kernel length.

To obtain a complete mapping vector $\hat{m}_i$, the obtained convolution embedding vectors $m_i^j$ are concatenated and linearly mapped, as shown in formula (5):

$$\hat{m}_i = \mathrm{concat}\big(m_i^1, m_i^2, \ldots, m_i^n\big)\, W_m \tag{5}$$

In the above formula, $W_m \in \mathbb{R}^{nq \times d_v}$ denotes the linear mapping matrix and $q$ denotes the size of the feature map, $q = (d_v - l)/s + 1$, with $s$ the convolution stride. After the multiple vectors are concatenated into a single vector, the dimension increases, so the matrix $W_m$ maps the $nq$-dimensional vector into a $d_v$-dimensional vector.

Adding the initial entity embedding vector $x_{v_i}$ to the mapping vector $\hat{m}_i$ yields the entity projection embedding vector $p_{v_i}$, as shown in formula (6):

$$p_{v_i} = x_{v_i} + \hat{m}_i \tag{6}$$
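A hedged sketch of this position-aware convolution under the reconstruction of formulas (4) to (6); in particular, the exact way $neb_i$ enters the kernel is an assumption (here it is simply added to the kernel weights before convolving):

```python
# Sketch of the convolutional neural network module (formulas (4)-(6)).
import torch
import torch.nn.functional as F

def entity_projection(x_v, kernels, neb_i, W_m, stride=1):
    # x_v: (d_v,) initial entity embedding; kernels: (n, l) kernels omega_i for
    # this tuple position; neb_i: number of adjacent entities in the tuple.
    shifted = kernels + neb_i                          # inject neighbor-count info
    m = F.conv1d(x_v.view(1, 1, -1),                   # formula (4): 1-D convolution
                 shifted.unsqueeze(1), stride=stride)  # -> shape (1, n, q)
    m_hat = m.flatten() @ W_m                          # formula (5): concat + mapping
    return x_v + m_hat                                 # formula (6): p_{v_i}

d_v, n, l, s = 200, 4, 3, 1
q = (d_v - l) // s + 1
W_m = torch.randn(n * q, d_v) * 0.01
p_v = entity_projection(torch.randn(d_v), torch.randn(n, l), neb_i=2.0, W_m=W_m)
```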
In step S6, the vectors processed by the attention mechanism module and the convolutional neural network module are further processed by the optimization module.

The optimization module comprises the residual network ResidualNet and a multi-layer perceptron. In this embodiment, the entity projection embedding vector processed by the convolutional neural network module is first processed by the residual network to obtain the entity residual vector $r_{v_i}$; the intention is to add a constant to the original changing gradient as a new gradient, thereby alleviating gradient vanishing. Then, to increase the nonlinear learning capability of the model, the multi-layer perceptron continues to process the entity residual vector $r_{v_i}$ to obtain the entity perception vector $h_{v_i}$. The specific steps are as follows:
(1) The entity projection embedding vector obtained from the convolutional neural network module is processed through the residual network, which specifically comprises the following steps:

The residual function $F(x)$ of the residual network adopts a convolutional neural network, and the whole residual network is computed as shown in formula (7):

$$r_{v_i} = \delta\big(F(p_{v_i}) + p_{v_i}\big) \tag{7}$$

In the above formula, $r_{v_i}$ denotes the entity residual vector, $\delta$ denotes the ReLU activation function, $\omega_i \in \mathbb{R}^{n \times l}$ denotes the convolution kernel at the $i$-th position in the tuple, $n$ denotes the number of convolution kernels at that position, $l$ denotes the convolution kernel length, and the result of $F(x)$ must be a vector of the same dimension as $p_{v_i}$.

When the two dimensions differ, a matrix $W_s$ can linearly map $p_{v_i}$ to match the two dimensions; the mapped calculation is shown in formula (7-1):

$$r_{v_i} = \delta\big(F(p_{v_i}) + W_s\, p_{v_i}\big) \tag{7-1}$$
The choice of the number of network layers in $F(x)$ is very flexible, and two or more layers can be selected. A single-layer network is not chosen because, when $F(x)$ is a single layer, formula (7) resembles a plain linear layer and offers no advantage. In summary, in this embodiment $F(x)$ adopts a two-layer convolutional neural network.
The residual network restores the learning gradient for the model through a skip connection: after the multi-layer network learning, the entity projection embedding vector $p_{v_i}$ is added back, so that the entity residual vector $r_{v_i}$ largely retains the feature and structure information of the nodes in the original hypergraph. The model therefore always contains the original information of the knowledge hypergraph during continuous learning, the original gradient is recovered, and the gradient vanishing problem is effectively alleviated.
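A minimal sketch of this two-layer convolutional residual block under formula (7); kernel sizes and padding are assumptions chosen so that $F(x)$ preserves the input dimension:

```python
# Sketch of the residual block of formula (7) with a two-layer CNN as F(x).
import torch
import torch.nn as nn

class EntityResidual(nn.Module):
    def __init__(self, n=4, l=3):
        super().__init__()
        self.F = nn.Sequential(                     # residual function F(x)
            nn.Conv1d(1, n, l, padding=l // 2),
            nn.ReLU(),
            nn.Conv1d(n, 1, l, padding=l // 2),
        )
        self.relu = nn.ReLU()                       # delta in formula (7)

    def forward(self, p_v):                         # p_v: (d_v,)
        x = p_v.view(1, 1, -1)
        return self.relu(self.F(x) + x).flatten()  # F(x) plus identity shortcut

r_v = EntityResidual()(torch.randn(200))           # entity residual vector
```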
(2) To further enhance the nonlinear learning capability of the model, the invention adopts a multi-layer perceptron to continue processing the entity residual vector.
The multi-layer perceptron is a model that nonlinearly maps input vectors to output vectors. It takes the entity residual vector $r_{v_i}$ obtained from formula (7) as the input-layer vector; the input-layer vector is connected to the output through weights, and the information propagation process of the multi-layer perceptron is expressed mathematically as shown in formula (8):

$$h^{(x)} = \delta_x\big(W^{(x)}\, h^{(x-1)} + b^{(x)}\big) \tag{8}$$

In the above formula, $h^{(x)}$ denotes the entity perception vector finally output by the neurons of layer $x$ (with $h^{(0)} = r_{v_i}$), $W^{(x)} \in \mathbb{R}^{d_x \times d_{x-1}}$ denotes the transformation parameters from layer $x-1$ to layer $x$, $d_x$ denotes the dimension of layer $x$, $b^{(x)}$ denotes the bias parameters of layer $x$, and $\delta_x$ denotes the activation function of layer $x$.
The number of neuron layers must be strictly controlled when using the multi-layer perceptron, because too many layers give the model excessive learning capacity and may cause overfitting. Experiments show that training works best with four layers of neurons, so the multi-layer perceptron used in the invention adopts two layers of neurons as hidden layers, and the information propagation process of the four-layer perceptron is expressed as shown in formula (8-1):

$$h_{v_i} = \delta_4\Big(W^{(4)}\,\delta_3\big(W^{(3)}\,\delta_2\big(W^{(2)}\, r_{v_i} + b^{(2)}\big) + b^{(3)}\big) + b^{(4)}\Big) \tag{8-1}$$
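A short sketch of this four-layer perceptron (input layer, two hidden layers, output layer); the hidden width and activation functions are assumed hyperparameters:

```python
# Sketch of the four-layer perceptron of formula (8-1).
import torch
import torch.nn as nn

def make_mlp(d_v, hidden=400):
    return nn.Sequential(
        nn.Linear(d_v, hidden), nn.ReLU(),     # input layer -> hidden layer 1
        nn.Linear(hidden, hidden), nn.ReLU(),  # hidden layer 1 -> hidden layer 2
        nn.Linear(hidden, d_v),                # hidden layer 2 -> output layer
    )

h_v = make_mlp(200)(torch.randn(200))          # entity perception vector
```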
In step S7, the processed tuple is scored by the preset scoring module to obtain the prediction result, and whether the scoring result of the tuple is correct is judged according to the evaluation index: if correct, the correct tuple is added into the knowledge hypergraph, complementing the knowledge hypergraph; if incorrect, the wrong tuple is discarded. The processed tuple comprises the processed relation vector and the processed entity vector, namely the relation attention vector $a_{e_i}$ and the entity perception vector $h_{v_i}$ obtained after optimization by the optimization module.
The processed tuple is scored by the preset scoring module, which specifically comprises the following steps:

After the initial relation embedding vector and the initial entity embedding vector are processed and optimized by the optimization module, the entity perception vector $h_{v_i}$ is obtained. The inner product between the relation attention vector $a_{e}$ and the perception vectors $h_{v_1}, \ldots, h_{v_{|e|}}$ of all entities within the tuple scores the tuple $T$, as shown in formula (9):

$$\phi(T) = \big\langle a_{e},\, h_{v_1}, \ldots, h_{v_{|e|}} \big\rangle = \sum_{k=1}^{d_v} a_{e}[k] \prod_{j=1}^{|e|} h_{v_j}[k] \tag{9}$$
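A minimal sketch of this scoring function, reading the inner product of formula (9) as the generalized (multi-linear) inner product over all vectors of the tuple; this reading is an assumption consistent with the wording above:

```python
# Sketch of the tuple scoring function of formula (9).
import torch

def score(a_e, H_v):
    # a_e: (d,) relation attention vector; H_v: (|e|, d) entity perception vectors
    return (a_e * H_v.prod(dim=0)).sum()  # sum_k a_e[k] * prod_j h_{v_j}[k]

phi = score(torch.randn(200), torch.randn(3, 200))  # score of one tuple T
```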
Then, judging whether the processed tuple is correct or not, specifically comprising the following steps:
During prediction, replacing an entity $v_i$ of the tuple $T$ with arbitrary other entities creates a negative tuple set $G_{neg}(T)$, whose elements are denoted $T'$, $T' \in G_{neg}(T)$. Each tuple $T'$ is scored with formula (9), and the tuples in $G_{neg}(T)$ are sorted by score to obtain the rank of the tuple $T$ within $G_{neg}(T)$. Depending on how the rank is used, either of the evaluation methods MRR or Hit@n can be adopted; in the specific experiments, both methods were run to check and ensure the accuracy of the results.

MRR denotes the mean reciprocal rank: the reciprocal of the rank of each evaluated tuple within its $G_{neg}(T)$ is computed and averaged. The MRR calculation is shown in formula (10):

$$\mathrm{MRR} = \frac{1}{N} \sum_{T} \frac{1}{\mathrm{rank}_T} \tag{10}$$

In the above formula, the summation traverses the reciprocals of the tuple ranks in $G_{neg}(T)$, $N$ denotes the number of evaluated tuples, and the larger the MRR value, the better the effect.

Hit@n denotes a class of evaluation methods whose calculation is shown in formula (11):

$$\mathrm{Hit@}n = \frac{num}{N} \tag{11}$$

If the rank is not greater than $n$, the tuple is considered a positive tuple; $n$ takes the value 1, 3, or 10, and $num$ denotes the number of positive tuples. The larger Hit@n is, the better the effect.
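These two metrics reduce to a few lines once the rank of each true tuple among its negatives is known (rank 1 = best), as the following sketch shows; the example ranks are illustrative:

```python
# Sketch of the MRR and Hit@n metrics of formulas (10)-(11).
def mrr(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)      # formula (10)

def hit_at_n(ranks, n):
    return sum(1 for r in ranks if r <= n) / len(ranks)  # formula (11)

ranks = [1, 4, 2, 15]  # rank of each true tuple within its G_neg(T)
print(mrr(ranks), hit_at_n(ranks, 3))                    # MRR and Hit@3
```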
The foregoing has described only the basic principles and preferred embodiments of the present invention, and many variations and modifications will be apparent to those skilled in the art in light of the above description, which variations and modifications are intended to be included within the scope of the present invention.

Claims (9)

1. A knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network is characterized by comprising the following steps of:
S1, loading a knowledge hypergraph to be complemented to obtain entities and relations in the knowledge hypergraph;
S2, initializing the entity and the relation obtained by loading in the step S1 to obtain an initial entity embedding vector and an initial relation embedding vector;
S3, inputting the initial entity embedding vector and the initial relation embedding vector obtained in step S2 into an ACLP model in tuple form for training, wherein the ACLP model at least comprises an attention mechanism module and a convolutional neural network module;
S4, processing the initial relation embedding vector obtained in step S2 through the attention mechanism module in step S3, and adding the information of the entities in the tuple into the relation embedding vector in proportion to their importance within the tuple, to obtain a processed relation attention vector; step S4 specifically comprises:

the input to the attention mechanism module is the initial relation embedding vector $x_{e_i} \in \mathbb{R}^{d_e}$ of the relation $e_i$ in the tuple and the corresponding set of initial entity embedding vectors $X_{e_i} = [x_{v_1}, x_{v_2}, \ldots, x_{v_{|e_i|}}] \in \mathbb{R}^{|e_i| \times d_v}$, where $d_e$ denotes the dimension of the relation $e$ when initialized to a vector, $1 \le i \le |E|$, $X_{e_i}$ is the matrix of all entity vectors in the relation $e_i$, $|e_i|$ denotes the number of entities contained in the relation $e_i$, and $d_v$ denotes the dimension of the entity $v$ when initialized to a vector;

first, the initial relation embedding vector $x_{e_i}$ and the initial entity embedding vector set $X_{e_i}$ are concatenated, the concatenated vectors are linearly mapped and processed by the LeakyReLU nonlinear function, and the projection vector $P$ containing the information of both the initial entity embedding vector set and the initial relation embedding vector is obtained, as shown in formula (1):

$$P = \mathrm{LeakyReLU}\big([\,x_{e_i} \,\|\, X_{e_i}\,]\,W_p\big) \tag{1}$$

in the above formula, $d_p$ denotes the dimension of the projection vector $P$, $W_p \in \mathbb{R}^{(d_e + d_v) \times d_p}$ denotes the mapping matrix, and $\|$ denotes the concatenation operation;

the projection vector $P$ is processed by softmax to obtain the weight vector $\alpha$ between the initial relation embedding vector $x_{e_i}$ and the initial entity embedding vector set $X_{e_i}$, as shown in formula (2):

$$\alpha_j = \mathrm{softmax}(P)_j = \frac{e^{P_j}}{\sum_{k=1}^{|e_i|} e^{P_k}} \tag{2}$$

in the above formula, softmax denotes the normalized exponential function, $e^{P_j}$ denotes $e$ raised to the power $P_j$, and $P_j$ denotes the $j$-th row of $P$;

accumulating the products of $\alpha_j$ and $x_{v_j}$ yields the relation attention vector $a_{e_i}$, as shown in formula (3):

$$a_{e_i} = \sum_{j=1}^{|e_i|} \alpha_j\, x_{v_j} \tag{3}$$
S5, extracting features of the initial entity embedded vector obtained in the step S2 through the convolutional neural network module in the step S3, and adding the entity adjacent number information in the tuple to a convolutional kernel in the convolutional neural network module to obtain a processed entity projection embedded vector;
S6, scoring the processed tuple through a preset scoring module to obtain a prediction result, and judging whether the scoring result of the tuple is correct or not according to an evaluation index: if the result is correct, adding the correct tuple into the knowledge hypergraph, complementing the knowledge hypergraph, and if the result is incorrect, discarding the incorrect tuple;
Wherein the processed tuple comprises the processed relationship vector and the processed entity vector.
2. The knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network according to claim 1, wherein the knowledge hypergraph is a graph composed of vertices and hyperedges, denoted:

KHG = {V, E}

In the above formula, $V = \{v_1, v_2, \ldots, v_{|V|}\}$ denotes the set of entities in KHG and $|V|$ denotes the number of entities contained in KHG; $E = \{e_1, e_2, \ldots, e_{|E|}\}$ denotes the set of relationships between entities, i.e. the set of hyperedges, and $|E|$ denotes the number of hyperedges contained in KHG; any hyperedge $e$ corresponds to a tuple $T = e(v_1, v_2, \ldots, v_{|e|})$, $T \in \tau$, where $|e|$ denotes the number of entities contained in the hyperedge $e$, i.e. the arity of $e$, and $\tau$ denotes the set of all tuples of the ideal, complete target knowledge hypergraph KHG.
3. The knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network according to claim 2, wherein step S5 specifically comprises:

first, the convolutional neural network module takes the initial entity embedding vector $x_{v_i} \in \mathbb{R}^{d_v}$ as input, and a convolution kernel $\omega_i$ containing tuple position information extracts the features of $x_{v_i}$; the parameter $neb_i$ then adds the number of adjacent entities to the kernel $\omega_i$, so that the features extracted by $\omega_i$ vary with the number of adjacent entities; the convolution embedding vector $m_i^j$ obtained after the convolution is computed as shown in formula (4):

$$m_i^j = x_{v_i} * \big(\omega_i^j + neb_i\big) \tag{4}$$

in the above formula, $\omega_i^j$ denotes row $j$ of the convolution kernel at the $i$-th position in the tuple, $\omega_i^j \in \mathbb{R}^{l}$, $*$ denotes the one-dimensional convolution operation, and $l$ denotes the convolution kernel length;

to obtain a complete mapping vector $\hat{m}_i$, the obtained convolution embedding vectors $m_i^j$ are concatenated and linearly mapped, as shown in formula (5):

$$\hat{m}_i = \mathrm{concat}\big(m_i^1, m_i^2, \ldots, m_i^n\big)\, W_m \tag{5}$$

in the above formula, $W_m \in \mathbb{R}^{nq \times d_v}$ denotes the linear mapping matrix and $q$ denotes the size of the feature map, $q = (d_v - l)/s + 1$, with $s$ the convolution stride; after the multiple vectors are concatenated into a single vector the dimension increases, so the matrix $W_m$ maps the $nq$-dimensional vector into a $d_v$-dimensional vector;

adding the initial entity embedding vector $x_{v_i}$ to the mapping vector $\hat{m}_i$ yields the entity projection embedding vector $p_{v_i}$, as shown in formula (6):

$$p_{v_i} = x_{v_i} + \hat{m}_i \tag{6}$$
4. The knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network according to claim 3, wherein the ACLP model further comprises an optimization module, and the optimization module comprises at least a residual network.
5. The knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network according to claim 4, wherein before step S6, the entity projection embedding vector obtained from the convolutional neural network module is processed through the residual network, specifically comprising the following steps:

the residual function $F(x)$ of the residual network adopts a convolutional neural network, and the whole residual network is computed as shown in formula (7):

$$r_{v_i} = \delta\big(F(p_{v_i}) + p_{v_i}\big) \tag{7}$$

in the above formula, $r_{v_i}$ denotes the entity residual vector, $\delta$ denotes the ReLU activation function, $\omega_i \in \mathbb{R}^{n \times l}$ denotes the convolution kernel at the $i$-th position in the tuple, $n$ denotes the number of convolution kernels at that position, $l$ denotes the convolution kernel length, and the result of $F(x)$ is a vector of the same dimension as $p_{v_i}$.
6. The knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network according to claim 5, wherein the optimization module further comprises a multi-layer perceptron.
7. The knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network according to claim 6, wherein before step S6, the entity residual vector $r_{v_i}$ is processed by the multi-layer perceptron, specifically comprising the following steps:

the multi-layer perceptron takes the entity residual vector $r_{v_i}$ obtained from formula (7) as the input-layer vector; the input-layer vector is connected to the output through weights, and the information propagation process of the multi-layer perceptron is expressed mathematically as shown in formula (8):

$$h^{(x)} = \delta_x\big(W^{(x)}\, h^{(x-1)} + b^{(x)}\big) \tag{8}$$

in the above formula, $h^{(x)}$ denotes the entity perception vector finally output by the neurons of layer $x$, $W^{(x)} \in \mathbb{R}^{d_x \times d_{x-1}}$ denotes the transformation parameters from layer $x-1$ to layer $x$, $d_x$ denotes the dimension of layer $x$, $b^{(x)}$ denotes the bias parameters of layer $x$, and $\delta_x$ denotes the activation function of layer $x$.
8. The knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network according to claim 7, wherein in step S6, the processed tuple is scored by the preset scoring module, specifically comprising the following steps:

after the initial relation embedding vector is processed by step S4 and the initial entity embedding vector is processed by step S5 and optimized by the optimization module, the entity perception vector $h_{v_i}$ is obtained; the inner product between the relation attention vector $a_{e}$ and the perception vectors $h_{v_1}, \ldots, h_{v_{|e|}}$ of all entities within the tuple scores the tuple $T$, as shown in formula (9):

$$\phi(T) = \big\langle a_{e},\, h_{v_1}, \ldots, h_{v_{|e|}} \big\rangle = \sum_{k=1}^{d_v} a_{e}[k] \prod_{j=1}^{|e|} h_{v_j}[k] \tag{9}$$
9. The knowledge hypergraph link prediction method combining an attention mechanism and a convolutional neural network according to claim 8, wherein in step S6, whether the processed tuple is correct is judged, specifically comprising the following steps: during prediction, replacing an entity $v_i$ of the tuple $T$ with arbitrary other entities creates a negative tuple set $G_{neg}(T)$, whose elements are denoted $T'$, $T' \in G_{neg}(T)$; each tuple $T'$ is scored with formula (9), and the tuples in $G_{neg}(T)$ are sorted by score to obtain the rank of the tuple $T$ within $G_{neg}(T)$; depending on how the rank is used, either of the evaluation methods MRR or Hit@n is adopted;

MRR denotes the mean reciprocal rank: the reciprocal of the rank of each evaluated tuple within its $G_{neg}(T)$ is computed and averaged; the MRR calculation is shown in formula (10):

$$\mathrm{MRR} = \frac{1}{N} \sum_{T} \frac{1}{\mathrm{rank}_T} \tag{10}$$

in the above formula, the summation traverses the reciprocals of the tuple ranks in $G_{neg}(T)$, $N$ denotes the number of evaluated tuples, and the larger the MRR value, the better the effect;

Hit@n denotes a class of evaluation methods whose calculation is shown in formula (11):

$$\mathrm{Hit@}n = \frac{num}{N} \tag{11}$$

if the rank is not greater than $n$, the tuple is considered a positive tuple; $n$ takes the value 1, 3, or 10, and $num$ denotes the number of positive tuples; the larger Hit@n is, the better the effect.
CN202210475730.1A 2022-04-29 2022-04-29 Knowledge hypergraph link prediction method combining attention mechanism and convolutional neural network Active CN114817568B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210475730.1A CN114817568B (en) 2022-04-29 2022-04-29 Knowledge hypergraph link prediction method combining attention mechanism and convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210475730.1A CN114817568B (en) 2022-04-29 2022-04-29 Knowledge hypergraph link prediction method combining attention mechanism and convolutional neural network

Publications (2)

Publication Number Publication Date
CN114817568A CN114817568A (en) 2022-07-29
CN114817568B 2024-05-10

Family

ID=82511304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210475730.1A Active CN114817568B (en) 2022-04-29 2022-04-29 Knowledge hypergraph link prediction method combining attention mechanism and convolutional neural network

Country Status (1)

Country Link
CN (1) CN114817568B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115757806B (en) * 2022-09-21 2024-05-28 清华大学 Super-relationship knowledge graph embedding method and device, electronic equipment and storage medium
CN116186295B (en) * 2023-04-28 2023-07-18 湖南工商大学 Attention-based knowledge graph link prediction method, attention-based knowledge graph link prediction device, attention-based knowledge graph link prediction equipment and attention-based knowledge graph link prediction medium
CN116579425B (en) * 2023-07-13 2024-02-06 北京邮电大学 Super-relationship knowledge graph completion method based on global and local level attention
CN117314266B (en) * 2023-11-30 2024-02-06 贵州大学 Novel intelligent scientific and technological talent evaluation method based on hypergraph attention mechanism

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020140386A1 (en) * 2019-01-02 2020-07-09 平安科技(深圳)有限公司 Textcnn-based knowledge extraction method and apparatus, and computer device and storage medium
CN112417219A (en) * 2020-11-16 2021-02-26 吉林大学 Hyper-graph convolution-based hyper-edge link prediction method
CN112613602A (en) * 2020-12-25 2021-04-06 神行太保智能科技(苏州)有限公司 Recommendation method and system based on knowledge-aware hypergraph neural network
CN112883200A (en) * 2021-03-15 2021-06-01 重庆大学 Link prediction method for knowledge graph completion
CN113051440A (en) * 2021-04-12 2021-06-29 北京理工大学 Link prediction method and system based on hypergraph structure
CN113792768A (en) * 2021-08-27 2021-12-14 清华大学 Hypergraph neural network classification method and device
CN113962358A (en) * 2021-09-29 2022-01-21 西安交通大学 Information diffusion prediction method based on time sequence hypergraph attention neural network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11526765B2 (en) * 2019-01-10 2022-12-13 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for a supra-fusion graph attention model for multi-layered embeddings and deep learning applications
US11593666B2 (en) * 2020-01-10 2023-02-28 Accenture Global Solutions Limited System for multi-task distribution learning with numeric-aware knowledge graphs

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020140386A1 (en) * 2019-01-02 2020-07-09 平安科技(深圳)有限公司 Textcnn-based knowledge extraction method and apparatus, and computer device and storage medium
CN112417219A (en) * 2020-11-16 2021-02-26 吉林大学 Hyper-graph convolution-based hyper-edge link prediction method
CN112613602A (en) * 2020-12-25 2021-04-06 神行太保智能科技(苏州)有限公司 Recommendation method and system based on knowledge-aware hypergraph neural network
CN112883200A (en) * 2021-03-15 2021-06-01 重庆大学 Link prediction method for knowledge graph completion
CN113051440A (en) * 2021-04-12 2021-06-29 北京理工大学 Link prediction method and system based on hypergraph structure
CN113792768A (en) * 2021-08-27 2021-12-14 清华大学 Hypergraph neural network classification method and device
CN113962358A (en) * 2021-09-29 2022-01-21 西安交通大学 Information diffusion prediction method based on time sequence hypergraph attention neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Improved capsule network method for knowledge graph completion; 王维美, 史一民, 李冠宇; Computer Engineering; 2020-12-31 (08); full text *

Also Published As

Publication number Publication date
CN114817568A (en) 2022-07-29

Similar Documents

Publication Publication Date Title
CN114817568B (en) Knowledge hypergraph link prediction method combining attention mechanism and convolutional neural network
CN110119467B (en) Project recommendation method, device, equipment and storage medium based on session
WO2023024412A1 (en) Visual question answering method and apparatus based on deep learning model, and medium and device
CN112784964A (en) Image classification method based on bridging knowledge distillation convolution neural network
CN112819833B (en) Large scene point cloud semantic segmentation method
KR102203065B1 (en) Triple verification device and method
CN113673594A (en) Defect point identification method based on deep learning network
CN113204633B (en) Semantic matching distillation method and device
CN113190654A (en) Knowledge graph complementing method based on entity joint embedding and probability model
CN114970517A (en) Visual question and answer oriented method based on multi-modal interaction context perception
CN112527993A (en) Cross-media hierarchical deep video question-answer reasoning framework
CN113516133A (en) Multi-modal image classification method and system
CN116975350A (en) Image-text retrieval method, device, equipment and storage medium
CN116187349A (en) Visual question-answering method based on scene graph relation information enhancement
CN112926655B (en) Image content understanding and visual question and answer VQA method, storage medium and terminal
CN110704665A (en) Image feature expression method and system based on visual attention mechanism
CN113935496A (en) Robustness improvement defense method for integrated model
CN116992151A (en) Online course recommendation method based on double-tower graph convolution neural network
CN112069399A (en) Personalized search system based on interactive matching
CN116844004A (en) Point cloud automatic semantic modeling method for digital twin scene
CN115496991A (en) Reference expression understanding method based on multi-scale cross-modal feature fusion
CN115512368A (en) Cross-modal semantic image generation model and method
CN113989566A (en) Image classification method and device, computer equipment and storage medium
Huang et al. Adapted GooLeNet for visual question answering
CN115936073B (en) Language-oriented convolutional neural network and visual question-answering method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant