CN112818136A - Time convolution-based interactive knowledge representation learning model TCIM prediction method - Google Patents


Info

Publication number
CN112818136A
Authority
CN
China
Prior art keywords
time
convolution
tcim
learning model
knowledge representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110220136.3A
Other languages
Chinese (zh)
Inventor
汪璟玢
陆玉乾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202110220136.3A priority Critical patent/CN112818136A/en
Publication of CN112818136A publication Critical patent/CN112818136A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a prediction method for TCIM, an interactive knowledge representation learning model based on time convolution. The invention captures the feature information of time through a convolutional neural network, thereby achieving dynamic knowledge graph completion.

Description

Time convolution-based interactive knowledge representation learning model TCIM prediction method
Technical Field
The invention relates to the technical field of knowledge graphs, in particular to a time convolution-based prediction method for the interactive knowledge representation learning model TCIM.
Background
Most existing models focus on knowledge graphs that contain no time information, whereas a dynamic knowledge graph considers not only the information of the triples but also the time information associated with them. Relationships between entities in a knowledge graph may change over time, as in the ICEWS 2014 and ICEWS 2005-15 data sets, where the fact triples contained may also change as time passes. For example, the fact (person, wasBornIn, city) evolves into another fact (person, diedIn, city) as the relation evolves with time. Current models for dynamic knowledge graphs include the HyTE model proposed by Dasgupta in 2018; TA-DistMult, a time-aware representation model proposed by García-Durán that uses a recurrent neural network to learn the type of relation; DE-SimplE; and so on. The main idea of the HyTE model is to project head and tail entities onto a time hyperplane and to translate on that hyperplane. HyTE achieves a good effect as a time-aware model, but the evolution of the roles that entities play in relations is not considered while head and tail entities evolve over time, and its prediction efficiency on many-to-one and many-to-many relations is low. The TA-DistMult model processes the time information into a time sequence, uses a recurrent neural network to capture effective features between the time sequence and the relation so that the relation between entities can perceive time, and finally performs link prediction through a translation model. Both models are in essence translation models; they differ from traditional translation models in that time information is added to enhance the expressiveness of the model.
Most models today study static knowledge graphs and ignore the important dimension of time; however, the relationships between the entities of a triple may change over time, so developing knowledge graph embedding models that fuse time becomes especially important. Existing time-aware models do not adequately consider extracting the global features of temporal information, nor the interactivity between time and triples.
Disclosure of Invention
In view of this, the present invention provides a time convolution-based prediction method for the interactive knowledge representation learning model TCIM, which captures the feature information of time through a convolutional neural network, thereby achieving dynamic knowledge graph completion.
The invention is realized by adopting the following scheme: a prediction method for the interactive knowledge representation learning model TCIM based on time convolution, in which the feature information of time is captured through a convolutional neural network, the feature information of the triple is extracted through a circular convolutional neural network, and the features of the triple and the features of time are fused for link prediction.
Further, the characteristic information of the time captured by the convolutional neural network is specifically:
the global features between times are extracted with a convolutional neural network. The year, month and day vectors are written as (v_y, v_m, v_d) and treated as a matrix T ∈ R^{D×3} whose columns are v_y, v_m and v_d, where D is the dimension of the embedding and v_y, v_m, v_d represent year, month and day respectively. On the convolutional layer, a 1×3 convolution kernel ω is used to extract the global relationship among the embedded time triple (v_y, v_m, v_d), obtaining at the same time the mutual information among years, months and days. ω repeats the convolution operation on each line of T to finally generate a feature map vector f = [f_1, f_2, ..., f_i, ..., f_n], where

f_i = g(ω * T_i + b)

in which T_i is the i-th line of T, b is a bias, * is the convolution operation, and g is the ReLU activation function.

The feature map vectors generated by different convolution kernels are spliced to form the final feature representation of the time information.
Further, extracting the feature information of the triple through the circular convolutional neural network specifically includes:

carrying out feature permutation: t permutations of the head entity e_s and the relation e_r are randomly generated, denoted as [(e_s^1, e_r^1), (e_s^2, e_r^2), ..., (e_s^t, e_r^t)];

carrying out feature reshaping: a reshaping function Φ reshapes each feature permutation, capturing the maximum interactive features between the entity features and the relation features;

carrying out circular padding on the feature map, the feature map after circular padding being C = [c_1, c_2, ..., c_n];

finally, the result obtained by convolution is

h_i = g(Ω * c_i + b_h)

in which b_h is a bias, * is the convolution operation, g is the ReLU activation function, and h_i represents the feature information of the triple.
Further, the arrangement of the features is reshaped by interleaving the elements of e_s and e_r so that elements from the same vector are not adjacent to each other.
Further, fusing the features of the triple with the features of time and performing link prediction specifically includes:

fusing the triple feature maps and the time feature maps, splicing the generated feature maps, multiplying by a weight matrix, and taking a dot product with the target entity to obtain the final score of the quadruple (h, r, t, T); the final score is formulated as follows:

S(h, r, t, T) = f(vec(Φ(e_h, e_r) ⋆ Ω_1 ⊕ T * Ω_2) · W) · v_t

where ⋆ denotes circular convolution, * is the convolution operation, T represents the time matrix, ⊕ represents the merging operation of the time vector and the triple vector, vec represents the joining operation of the vectors, v_t is the vector matrix of the target entity, W is a learnable parameter matrix, f is the activation function, h represents the head entity, r represents the relation, t represents the tail entity, and Ω_1 and Ω_2 represent two different convolution kernels.
Further, the TCIM model is trained by using the Adam optimizer to minimize the loss function, and the score function is processed with the sigmoid function δ(·), so that P = δ(S(h, r, t)); the loss function is as follows:

L = -(1/N) Σ_{i=1}^{N} ( t_i · log(P_i) + (1 - t_i) · log(1 - P_i) )

in which t is the label vector, N is the number of training data, P_i is a predicted value, and t_i represents the label corresponding to P_i.
Further, the fusion method comprises feature addition, feature multiplication or gating fusion.
The invention also provides a prediction system for a time convolution based interactive knowledge representation learning model TCIM, comprising a memory, a processor and computer program instructions stored on the memory and executable by the processor, which when executed by the processor, are capable of implementing the method steps as described above.
The present invention also provides a computer readable storage medium having stored thereon computer program instructions executable by a processor, the computer program instructions when executed by the processor being capable of performing the method steps as described above.
Compared with the prior art, the invention has the following beneficial effects:
1. Most traditional models perform link prediction on static knowledge graphs and ignore the important dimension of time; the invention proposes capturing the feature information of time through a convolutional neural network, so that a dynamic knowledge graph can be completed.
2. Most existing neural network models do not effectively fuse time feature information with triple feature information; the three fusion methods provided by the invention can effectively fuse time features and triple features.
Drawings
FIG. 1 is a schematic diagram of a method according to an embodiment of the present invention.
FIG. 2 is a gating diagram according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in fig. 1, the embodiment provides a prediction method of a time convolution-based interactive knowledge representation learning model TCIM, which captures feature information of time through a convolution neural network, extracts feature information of a triplet through a circular convolution neural network, fuses features of the triplet and features of the time, and performs link prediction.
The present embodiment first gives the following definitions:
Definition 1 (quadruple, S): let S = (h, r, t, T) denote a quadruple, where h denotes the head entity, r denotes the relation, t denotes the tail entity, and T denotes time. A quadruple may also be referred to as a knowledge or fact.
Definition 2 (entity set, E): let E = {e_1, e_2, ..., e_n} denote the set of all entities in the knowledge base.
Definition 3 (relation set, R): let R = {r_1, r_2, ..., r_n} denote the set of all relations in the knowledge base.
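The definitions above can be captured with a small data structure; the following sketch is purely illustrative (the class name and field values are hypothetical, not part of the patent):

```python
from typing import NamedTuple

class Quadruple(NamedTuple):
    """A temporal fact (h, r, t, T): head entity, relation, tail entity, time."""
    h: str
    r: str
    t: str
    T: str

# Definition 1: a quadruple is one knowledge/fact
facts = [
    Quadruple("person", "wasBornIn", "city", "1990-05-17"),
    Quadruple("person", "diedIn", "city", "2070-01-01"),
]

# Definitions 2 and 3: entity and relation sets collected from the knowledge base
E = {q.h for q in facts} | {q.t for q in facts}   # entity set
R = {q.r for q in facts}                          # relation set
```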
In this embodiment, capturing the feature information of time through the convolutional neural network specifically includes:
Existing temporal knowledge graph models such as HyTE project entities and relations onto a hyperplane of time, while TA-DistMult transforms time into a time series whose features are learned with an LSTM. But these models ignore the capture of global features within time, so the present embodiment proposes a model that uses a convolutional neural network to extract global features between times.
The set of valid times in the knowledge base G is in the form of year, month, day, denoted (y, m, d). The global features between times are extracted with a convolutional neural network. The year, month and day vectors are written as (v_y, v_m, v_d) and treated as a matrix T ∈ R^{D×3} whose columns are v_y, v_m and v_d, where D is the dimension of the embedding and v_y, v_m, v_d represent year, month and day respectively. On the convolutional layer, a 1×3 convolution kernel ω is used to extract the global relationship among the embedded time triple (v_y, v_m, v_d), obtaining at the same time the mutual information among years, months and days. ω repeats the convolution operation on each line of T to finally generate a feature map vector f = [f_1, f_2, ..., f_i, ..., f_n], where

f_i = g(ω * T_i + b)

in which T_i is the i-th line of T, b is a bias, * is the convolution operation, and g is the ReLU activation function.

The feature map vectors generated by different convolution kernels are spliced to form the final feature representation of the time information.
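As an illustrative sketch (not the patented implementation), the time-feature convolution can be written in numpy; the kernel values and dimensions below are hypothetical, and a single 1×3 kernel is assumed to span the (year, month, day) entries at each embedding dimension:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def time_feature_map(v_y, v_m, v_d, omega, b=0.0):
    """f_i = g(omega . T_i + b): one 1x3 kernel spans the (year, month, day)
    triple at each embedding dimension i, capturing their mutual information."""
    T = np.stack([v_y, v_m, v_d])          # shape (3, D)
    D = T.shape[1]
    return np.array([relu(float(omega @ T[:, i]) + b) for i in range(D)])

D = 4
rng = np.random.default_rng(0)
v_y, v_m, v_d = rng.normal(size=(3, D))    # toy year/month/day embeddings
omega = np.array([0.5, 0.3, 0.2])          # one hypothetical 1x3 kernel
f = time_feature_map(v_y, v_m, v_d, omega) # one feature map vector, length D

# feature maps from several kernels are spliced into the final time representation
kernels = [omega, np.array([1.0, -1.0, 0.0])]
time_repr = np.concatenate([time_feature_map(v_y, v_m, v_d, w) for w in kernels])
```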
In this embodiment, the extracting, by using the circular convolutional neural network, the feature information of the triplet specifically includes:
because the prior embedded model based on convolution ignores the interactivity of the entity and the relation, the expressive ability of the model can be strengthened by fully capturing the interactivity between the entity and the relation, and therefore the model has better prediction accuracy. Therefore, in order to enhance the interaction between the entity and the relationship, the embodiment performs three steps, 1) feature arrangement; 2) characteristic remodeling; 3) circular filling, specifically as follows:
Carrying out feature permutation: t permutations of the head entity e_s and the relation e_r are randomly generated, denoted as [(e_s^1, e_r^1), (e_s^2, e_r^2), ..., (e_s^t, e_r^t)]. For t different permutations, the expected total number of interactions is approximately t times that of a single permutation.
Carrying out feature reshaping: a reshaping function Φ reshapes each feature permutation, capturing the maximum interactive features between the entity features and the relation features.
Performing circular padding on the feature map: on the horizontal axis, this embodiment wraps the right side of the feature map around to the leftmost side, and vice versa; on the vertical axis, the top of the feature map is wrapped around to the bottom, and vice versa. Circular padding is applied to each convolutional layer, since padding in feature space led to better performance in the experiments. The feature map after circular padding is C = [c_1, c_2, ..., c_n].
Finally, the result obtained by convolution is

h_i = g(Ω * c_i + b_h)

in which b_h is a bias, * is the convolution operation, g is the ReLU activation function, and h_i represents the feature information of the triple.
In this embodiment, the arrangement of the features is reshaped by interleaving the elements of e_s and e_r so that elements from the same vector are not adjacent to each other.
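A chequered interleaving and circular padding can be sketched as follows; the layout (alternating by board position) and the padding width of 1 are assumptions for illustration, since the patent's figures are not reproduced in this text:

```python
import numpy as np

def chequered_reshape(e_s, e_r, rows, cols):
    """Place entity and relation elements in a chequerboard pattern so that
    no two elements of the same vector are adjacent, maximising interactions."""
    assert rows * cols == 2 * len(e_s) and len(e_s) == len(e_r)
    m = np.empty((rows, cols))
    ent, rel = iter(e_s), iter(e_r)
    for i in range(rows):
        for j in range(cols):
            m[i, j] = next(ent) if (i + j) % 2 == 0 else next(rel)
    return m

def circular_pad(fmap, p=1):
    """Wrap the feature map on both axes: right edge feeds the left border
    and bottom edge feeds the top border (and vice versa)."""
    return np.pad(fmap, p, mode="wrap")

e_s = np.arange(4, dtype=float)      # toy head-entity embedding
e_r = np.arange(4, 8, dtype=float)   # toy relation embedding
m = chequered_reshape(e_s, e_r, 2, 4)
padded = circular_pad(m)             # 1-pixel circular padding on each side
```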
In this embodiment, the fusing the characteristics of the triples and the characteristics of the time and performing the link prediction specifically includes:
the triple feature maps and the time feature maps are fused, the generated feature maps are spliced, multiplied by a weight matrix, and a dot product is taken with the target entity to obtain the final score of the quadruple (h, r, t, T); the final score is formulated as follows:

S(h, r, t, T) = f(vec(Φ(e_h, e_r) ⋆ Ω_1 ⊕ T * Ω_2) · W) · v_t

where ⋆ denotes circular convolution, * is the convolution operation, T represents the time matrix, ⊕ represents the merging operation of the time vector and the triple vector, vec represents the joining operation of the vectors, v_t is the vector matrix of the target entity, W is a learnable parameter matrix, f is the activation function (in this embodiment, f uses the ReLU function), h represents the head entity, r represents the relation, t represents the tail entity, and Ω_1 and Ω_2 represent different convolution kernels.
In this embodiment, the loss function uses the standard binary cross-entropy loss in conjunction with label smoothing. The Adam optimizer is employed to minimize the loss function to train the TCIM model, and the sigmoid function δ(·) is used to process the score function, so that P = δ(S(h, r, t)); the loss function is as follows:

L = -(1/N) Σ_{i=1}^{N} ( t_i · log(P_i) + (1 - t_i) · log(1 - P_i) )

in which t is the label vector, N is the number of training data, P_i is a predicted value, and t_i represents the label corresponding to P_i.
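The standard binary cross-entropy loss over sigmoid-processed scores can be sketched in numpy (label smoothing and the Adam optimizer are omitted here for brevity; the toy scores are illustrative):

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def bce_loss(scores, labels, eps=1e-12):
    """L = -(1/N) * sum_i [ t_i*log(P_i) + (1-t_i)*log(1-P_i) ],
    with P_i = sigmoid(S_i)."""
    P = sigmoid(np.asarray(scores, dtype=float))
    t = np.asarray(labels, dtype=float)
    return float(-np.mean(t * np.log(P + eps) + (1 - t) * np.log(1 - P + eps)))

good = bce_loss([4.0, -3.0, 2.5], [1, 0, 1])  # scores agree with labels: low loss
bad = bce_loss([-4.0, 3.0], [1, 0])           # scores contradict labels: high loss
```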
In this embodiment, the fusion method includes feature addition, feature multiplication, or gated fusion, which is specifically as follows:
1) Feature addition. The time feature and the triple feature are added to obtain the final feature representation:
F = T_f + E_f
2) Feature multiplication. The time feature and the triple feature are multiplied to obtain the final feature representation; the multiplication here is element-wise:
F = T_f ⊙ E_f
3) Gated fusion. The design of this embodiment borrows the gating mechanism at the core of the long short-term memory network (LSTM): the gate screens effective information, realizes parameter sharing, and reduces the risk of model overfitting. The model diagram of this calculation is shown in fig. 2. The formula is as follows:
F = E_f ⊙ σ(concat(E_f, T_f) * w_h + b_h)
where w_h and b_h are parameters to be learned, ⊙ is the Hadamard product, * is matrix multiplication, concat is the join operation, and σ is the activation function; a sigmoid activation function is used here.
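The three fusion methods can be sketched in numpy as follows; the toy dimensions and zero-initialized gate parameters are illustrative assumptions, not trained values:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_add(E_f, T_f):
    """1) Feature addition: F = T_f + E_f."""
    return T_f + E_f

def fuse_mul(E_f, T_f):
    """2) Element-wise (Hadamard) multiplication: F = T_f * E_f."""
    return T_f * E_f

def fuse_gated(E_f, T_f, w_h, b_h):
    """3) Gated fusion: F = E_f * sigmoid(concat(E_f, T_f) @ w_h + b_h),
    an LSTM-style gate that screens which features pass through."""
    gate = sigmoid(np.concatenate([E_f, T_f]) @ w_h + b_h)
    return E_f * gate

d = 3
E_f = np.array([1.0, 2.0, 3.0])            # toy triple features
T_f = np.array([0.5, 0.5, 0.5])            # toy time features
w_h = np.zeros((2 * d, d))                 # toy learnable parameters
b_h = np.zeros(d)
F_gate = fuse_gated(E_f, T_f, w_h, b_h)    # zero weights -> gate = 0.5 everywhere
```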
The present embodiment also provides a prediction system for a time-convolution based interactive knowledge representation learning model TCIM, comprising a memory, a processor and computer program instructions stored on the memory and executable by the processor, which when executed by the processor, are capable of implementing the method steps as described above.
The present embodiments also provide a computer readable storage medium having stored thereon computer program instructions executable by a processor, the computer program instructions, when executed by the processor, being capable of performing the method steps as described above.
Preferably, when the method of the present embodiment is applied to the knowledge-graph complementation, the specific steps are as follows:
1. The quadruples (h, r, t, T) in the knowledge graph are taken as the model input.
2. Feature rearrangement and feature reshaping are carried out on the vectors of the entities (including the head entity h and the tail entity t) and the relation r, and features are extracted using circular convolution to obtain the triple feature information E_f.
3. The time T is parsed into (year, month, day) information, which is vectorized, and a convolution operation is performed with a 1×3 convolution kernel to obtain the time feature information T_f.
4. The triple feature E_f and the time feature information T_f are fused to obtain the final fused feature vector representation F.
5. Finally, the features of the fused vector F are compressed through a fully connected layer, the compressed feature representation is multiplied by all entity vectors, and sigmoid activation is applied to obtain the score of each candidate.
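Steps 4 and 5 above (scoring all candidate entities from the fused features) can be sketched as follows; the feature extractors are stubbed with random vectors and all dimensions are hypothetical:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def score_all_entities(F, W, entity_matrix):
    """Step 5: compress the fused features through a fully connected layer,
    multiply by every entity vector, and squash with sigmoid so each
    candidate entity gets a score in (0, 1)."""
    compressed = F @ W                       # fully connected layer
    return sigmoid(entity_matrix @ compressed)

rng = np.random.default_rng(1)
n_entities, feat_dim, emb_dim = 5, 8, 4
F = rng.normal(size=feat_dim)                # stand-in for the fused vector of step 4
W = rng.normal(size=(feat_dim, emb_dim))     # learnable projection matrix
entity_matrix = rng.normal(size=(n_entities, emb_dim))  # all entity embeddings
scores = score_all_entities(F, W, entity_matrix)
```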
The present embodiment uses the ICEWS data set, which comes from the Integrated Crisis Early Warning System (ICEWS) project of Lockheed Martin. (Table omitted: statistics of the ICEWS14 data set.)
The evaluation index of this embodiment is Hits@N: the proportion of test triples whose correct entity is ranked within the top N. If the head or tail entity of an original test triple is ranked in the top N, the hit count increases by 1; otherwise it increases by 0. All hits are then summed and averaged to obtain the value of Hits@N. A larger Hits@N indicates better performance.
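The Hits@N computation can be sketched in a few lines (the ranks below are toy values; ranks are assumed to be 1-based):

```python
def hits_at_n(ranks, n):
    """Hits@N: fraction of test queries whose correct entity ranks <= N."""
    return sum(1 for r in ranks if r <= n) / len(ranks)

ranks = [1, 3, 12, 2, 50, 7]   # toy ranks of the true entity per test query
h1 = hits_at_n(ranks, 1)       # only the rank-1 query hits
h10 = hits_at_n(ranks, 10)     # ranks 1, 3, 2 and 7 hit
```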
Finally, verification confirms the effectiveness of the algorithm of this embodiment: in testing on the ICEWS14 data set, the model of this embodiment improves by 0.9% at Hits@10 and by 20% at Hits@1 compared with the diachronic embedding (DE-SimplE) model for time-aware graph completion. From the above results it can be concluded that TCIM, using convolutional neural networks, can capture the features of entities, relations and time well and achieves better performance.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is directed to preferred embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. However, any simple modification, equivalent change and modification of the above embodiments according to the technical essence of the present invention are within the protection scope of the technical solution of the present invention.

Claims (9)

1. A prediction method for an interactive knowledge representation learning model TCIM based on time convolution, characterized in that the feature information of time is captured through a convolutional neural network, the feature information of the triple is extracted through a circular convolutional neural network, and the features of the triple and the features of time are fused for link prediction.
2. The method for predicting the interactive knowledge representation learning model TCIM based on time convolution as claimed in claim 1, wherein capturing the feature information of time through the convolutional neural network is specifically:
the global features between times are extracted with a convolutional neural network. The year, month and day vectors are written as (v_y, v_m, v_d) and treated as a matrix T ∈ R^{D×3} whose columns are v_y, v_m and v_d, where D is the dimension of the embedding and v_y, v_m, v_d represent year, month and day respectively. On the convolutional layer, a 1×3 convolution kernel ω is used to extract the global relationship among the embedded time triple (v_y, v_m, v_d), obtaining at the same time the mutual information among years, months and days. ω repeats the convolution operation on each line of T to finally generate a feature map vector f = [f_1, f_2, ..., f_i, ..., f_n], where

f_i = g(ω * T_i + b)

in which T_i is the i-th line of T, b is a bias, * is the convolution operation, and g is the ReLU activation function.
The feature map vectors generated by different convolution kernels are spliced to form the final feature representation of the time information.
3. The prediction method of the interactive knowledge representation learning model TCIM based on time convolution as claimed in claim 1, wherein extracting the feature information of the triple through the circular convolutional neural network is specifically:
carrying out feature permutation: t permutations of the head entity e_s and the relation e_r are randomly generated, denoted as [(e_s^1, e_r^1), (e_s^2, e_r^2), ..., (e_s^t, e_r^t)];
carrying out feature reshaping: a reshaping function Φ reshapes each feature permutation, capturing the maximum interactive features between the entity features and the relation features;
carrying out circular padding on the feature map, the feature map after circular padding being C = [c_1, c_2, ..., c_n];
finally, the result obtained by convolution is

h_i = g(Ω * c_i + b_h)

in which b_h is a bias, * is the convolution operation, g is the ReLU activation function, and h_i represents the feature information of the triple.
4. The method for predicting the interactive knowledge representation learning model TCIM based on time convolution of claim 3, wherein the arrangement of the features is reshaped by interleaving the elements of e_s and e_r so that elements from the same vector are not adjacent to each other.
5. The method for predicting the interactive knowledge representation learning model TCIM based on time convolution according to claim 1, wherein the fusing the characteristics of the triples and the characteristics of the time and performing the link prediction specifically includes:
fusing the triple characteristic graphs and the time characteristic graphs, splicing the generated characteristic graphs, multiplying the characteristic graphs by the weight matrix, and performing dot product on the characteristic graphs and the target entity to obtain the final score of the quadruple (h, r, T, T); the final score is formulated as follows:
S(h, r, t, T) = f(vec([φ(P) ⋆ Ω_1 ; T ⋆ Ω_2]) W) · v_t
where ⋆ denotes the circular convolution operation, T represents the time matrix, [· ; ·] represents the merging operation of the time vector and the triple vector, vec represents the joining (flattening) of a feature map into a vector, v_t is the vector of the target entity, W is a learnable parameter matrix, f is the activation function, h represents the head entity, r the relation, t the tail entity, and Ω_1 and Ω_2 are two different convolution kernels.
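The scoring pipeline of claim 5 (splice the triple and time feature maps, multiply by the weight matrix W, dot with the target-entity vector) can be sketched as below; the dimensions, the random weights, and the ReLU choice for f are assumptions for illustration only:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def score(triple_feat, time_feat, W, v_t):
    """Splice the triple and time feature maps, project through a
    learnable matrix W, and dot with the target-entity vector v_t."""
    merged = np.concatenate([triple_feat.ravel(), time_feat.ravel()])  # vec([. ; .])
    projected = relu(merged @ W)                                       # f(vec(.) W)
    return float(projected @ v_t)                                      # dot product

# toy dimensions: 4-d triple map, 3-d time map, 5-d entity space
triple_feat = np.array([0.2, -0.1, 0.4, 0.3])
time_feat = np.array([0.1, 0.0, 0.5])
rng = np.random.default_rng(0)
W = rng.standard_normal((7, 5))    # (4 + 3) spliced features -> 5-d entity space
v_t = rng.standard_normal(5)       # hypothetical target-entity embedding
s = score(triple_feat, time_feat, W, v_t)
```

A higher score indicates that the quadruple (h, r, t, T) is more plausible; ranking candidate tail entities by this score yields the link prediction.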
6. The prediction method of the interactive knowledge representation learning model TCIM based on time convolution, characterized in that an Adam optimizer is adopted to minimize a loss function to train the TCIM model, and a sigmoid function σ(·) is used to process the score function, so that P = σ(S(h, r, t)); the loss function is as follows:
L = −(1/N) Σ_{i=1}^{N} [ t_i · log P_i + (1 − t_i) · log(1 − P_i) ]
where t is the label vector, N is the number of training data, P_i is the predicted value, and t_i represents the label corresponding to P_i.
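Assuming the loss is the standard binary cross-entropy consistent with the symbols of claim 6 (predictions P_i = σ(S_i), labels t_i), a minimal sketch is:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(scores, labels):
    """Binary cross-entropy over quadruple scores: P_i = sigmoid(S_i),
    averaged negative log-likelihood against the labels t_i."""
    p = sigmoid(scores)
    return float(-np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p)))

# toy scores for three quadruples: two positives, one negative sample
scores = np.array([2.0, -1.0, 0.5])
labels = np.array([1.0, 0.0, 1.0])
loss = bce_loss(scores, labels)
```

In training, this loss would be minimized with an optimizer such as Adam over the entity, relation, time, and convolution parameters.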
7. The method for predicting the interactive knowledge representation learning model TCIM based on time convolution, characterized in that the fusion method comprises feature addition, feature multiplication, or gated fusion.
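The three fusion alternatives of claim 7 can be sketched as follows; the gate here is a simple element-wise sigmoid gate, an assumed form since the claim does not fix its parameterization:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse(a, b, method="gate", w_g=None):
    """Fuse two feature vectors by element-wise addition, element-wise
    multiplication, or a learned gate that mixes a and b."""
    if method == "add":
        return a + b
    if method == "mul":
        return a * b
    if method == "gate":
        g = sigmoid(w_g)            # gate values in (0, 1), one per element
        return g * a + (1 - g) * b  # convex mix of the two feature vectors
    raise ValueError(f"unknown fusion method: {method}")

a = np.array([1.0, 2.0])   # toy triple features
b = np.array([3.0, 4.0])   # toy time features
added = fuse(a, b, "add")
gated = fuse(a, b, "gate", w_g=np.zeros(2))  # zero gate weights -> 0.5/0.5 mix
```

Gated fusion lets the model learn, per feature dimension, how much the time information should override the triple information, whereas addition and multiplication apply a fixed combination rule.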
8. A prediction system for a time-convolution based interactive knowledge representation learning model TCIM, comprising a memory, a processor and computer program instructions stored on the memory and executable by the processor, the computer program instructions, when executed by the processor, being capable of carrying out the method steps of any one of claims 1 to 7.
9. A computer-readable storage medium, having stored thereon computer program instructions executable by a processor, the computer program instructions, when executed by the processor, being capable of carrying out the method steps according to any one of claims 1 to 7.
CN202110220136.3A 2021-02-26 2021-02-26 Time convolution-based interactive knowledge representation learning model TCIM prediction method Pending CN112818136A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110220136.3A CN112818136A (en) 2021-02-26 2021-02-26 Time convolution-based interactive knowledge representation learning model TCIM prediction method

Publications (1)

Publication Number Publication Date
CN112818136A true CN112818136A (en) 2021-05-18

Family

ID=75864129

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110220136.3A Pending CN112818136A (en) 2021-02-26 2021-02-26 Time convolution-based interactive knowledge representation learning model TCIM prediction method

Country Status (1)

Country Link
CN (1) CN112818136A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113190691A (en) * 2021-05-28 2021-07-30 齐鲁工业大学 Link prediction method and system of knowledge graph
CN113190691B (en) * 2021-05-28 2022-09-23 齐鲁工业大学 Link prediction method and system of knowledge graph
CN114117064A (en) * 2021-11-09 2022-03-01 西南交通大学 Knowledge dynamic evolution method based on multi-time granularity and application

Similar Documents

Publication Publication Date Title
Liu et al. Weighing counts: Sequential crowd counting by reinforcement learning
US8756183B1 (en) System for representing, storing, and reconstructing an input signal
CN111581966B (en) Context feature-fused aspect-level emotion classification method and device
TWI655587B (en) Neural network and method of neural network training
CN112818136A (en) Time convolution-based interactive knowledge representation learning model TCIM prediction method
CN115147891A (en) System, method, and storage medium for generating synthesized depth data
KR20210081769A (en) Attack-less Adversarial Training for a Robust Adversarial Defense
Fang et al. Multiscale CNNs ensemble based self-learning for hyperspectral image classification
Kim et al. Exploring temporal information dynamics in spiking neural networks
CN109242029A (en) Identify disaggregated model training method and system
Kamilaris et al. Training deep learning models via synthetic data: Application in unmanned aerial vehicles
Pan et al. Fully convolutional neural networks with full-scale-features for semantic segmentation
CN114723784A (en) Pedestrian motion trajectory prediction method based on domain adaptation technology
Zhang et al. The performance research of the data augmentation method for image Classification
Qu et al. Improving the reliability for confidence estimation
Xu et al. Cross-domain few-shot classification via inter-source stylization
Du et al. Structure tuning method on deep convolutional generative adversarial network with nondominated sorting genetic algorithm II
Wang et al. Lightweight bilateral network for real-time semantic segmentation
CN116824334A (en) Model back door attack countermeasure method based on frequency domain feature fusion reconstruction
CN116051320A (en) Multitasking attention knowledge tracking method and system for online learning platform
Lambertenghi et al. Assessing Quality Metrics for Neural Reality Gap Input Mitigation in Autonomous Driving Testing
Xu et al. MMT: Mixed-Mask Transformer for Remote Sensing Image Semantic Segmentation
Ramachandra Causal inference for climate change events from satellite image time series using computer vision and deep learning
Shah et al. Reasoning over history: Context aware visual dialog
CN111898708A (en) Transfer learning method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210518