CN116402138A - Time sequence knowledge graph reasoning method and system for multi-granularity historical aggregation - Google Patents
- Publication number
- CN116402138A (application CN202310276162.7A)
- Authority
- CN
- China
- Prior art keywords
- event
- historical
- entity
- knowledge graph
- time sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
Abstract
The invention provides a time sequence knowledge graph reasoning method and system for multi-granularity historical aggregation, relating to the field of natural language processing. The method comprises the following steps: acquiring time sequence knowledge graph data of an event to be inferred; intercepting time sequence knowledge graph data of different lengths to obtain training data; modeling entity representations and relationship representations of different granularities using a historical recursive network; modeling entity representations of different granularities again using a historical convolutional network; calculating event probabilities based on the entity characterization and relationship characterization modeling results, and constructing a loss function to train the model; and inferring the missing event using the trained model. By fusing historical information of different lengths and granularities, the invention addresses the problem that existing time sequence knowledge graph reasoning does not fully consider the periodicity and repeatability of historical event development, which leads to low event reasoning accuracy and makes it difficult to meet complex event reasoning requirements.
Description
Technical Field
The invention relates to the field of natural language processing, in particular to a time sequence knowledge graph reasoning method and a system for multi-granularity historical aggregation.
Background
The time sequence knowledge graph is an extension of the knowledge graph: time is added to the triplet representing an event to form a quadruple, so that each fact is given a specific time and reality is depicted more faithfully. Time sequence knowledge graph reasoning is one of the main applications of the time sequence knowledge graph, and aims to predict future events by mining the potential evolution rules of historical events.
At present, the time sequence knowledge graph reasoning task is mainly realized through characterization learning: a characterization vector is learned for each entity and relation in the time sequence knowledge graph, and a score function is constructed to evaluate the likelihood that a future event occurs. Generally, a graph convolutional neural network is first used to learn from the static knowledge graph data of each period; a recurrent neural network is then combined to model the time sequence information of graph evolution; the characterization vectors of entities and relations are constructed by using these two modes alternately, and finally the scores of different triplets are calculated with a normalized exponential function or a more complex function. Existing time sequence knowledge graph reasoning mainly focuses on learning the fact evolution pattern between adjacent moments in the history and omits the periodicity and repeatability in the development of historical events, which leads to low event reasoning accuracy and makes it difficult to meet complex event reasoning requirements.
Disclosure of Invention
The invention provides a time sequence knowledge graph reasoning method and system for multi-granularity historical aggregation. By fusing historical information of different lengths and granularities, it addresses the problem that existing time sequence knowledge graph reasoning mainly focuses on learning the fact evolution pattern between adjacent moments in the history and omits the periodicity and repeatability in the development of historical events, which leads to low event reasoning accuracy and makes it difficult to meet complex event reasoning requirements.
In a first aspect, an embodiment of the present invention provides a method for reasoning a timing knowledge graph of multi-granularity historical aggregation, where the method includes the following steps:
acquiring time sequence knowledge graph data of an event to be inferred;
intercepting time sequence knowledge graph data with different lengths to obtain training data;
modeling entity representations and relationship representations of different granularities using a historical recursive network based on training data;
modeling entity representations of different granularities again using a historical convolutional network based on training data;
calculating event probability based on the entity characterization and relationship characterization modeling results, and constructing a loss function to realize model training;
the event of the absence is inferred using the trained model.
In this embodiment, historical information of different lengths and granularities is deeply modeled based on the historical recursive network and the historical convolutional network, improving the accuracy of future event prediction reasoning; moreover, from the perspective of characterization learning, embedded vectors are learned for entities and relations, which can also be used for other tasks such as entity classification and event classification alongside future event prediction reasoning.
As some optional embodiments of the present application, the time sequence knowledge graph data include static knowledge graph data for a plurality of different time periods.
As some optional embodiments of the present application, the static knowledge graph data include a set of entities, a set of relationships, and a set of events.
As some optional embodiments of the present application, the flow of modeling entity characterizations and relationship characterizations of different granularities using a historical recursive network based on training data is as follows:
acquiring historical paths with different lengths based on training data, and aggregating entity characterization and relationship characterization at each moment by adopting a relationship graph convolutional neural network based on the historical paths with different lengths;
and capturing the map change between adjacent moments by alternately using a gating cyclic neural network so as to obtain an entity characterization matrix set and a relation characterization matrix set.
As some optional embodiments of the present application, the flow of modeling entity characterizations of different granularities again using a historical convolutional network based on training data is as follows:
acquiring a longest historical path based on training data, and aggregating entity representations at each moment by adopting a relation graph convolutional neural network based on the longest historical path;
and synchronously carrying out multi-level convolution processing on the maps between adjacent moments by using a convolution neural network so as to obtain a composite entity characterization matrix set.
As some optional embodiments of the present application, the process of calculating event probabilities based on entity characterization and relationship characterization modeling results is as follows:
based on the entity characterization matrix set and the relation characterization matrix set, performing feature extraction by adopting a convolutional neural network to obtain a first event feature map;
performing vector inner product calculation based on the first event feature map to obtain a first event score;
based on the composite entity characterization matrix set, performing feature extraction by adopting a convolutional neural network to obtain a second event feature map;
performing vector inner product calculation based on the second event feature map to obtain a second event score;
based on the first event score and the second event score, respectively adopting a normalized exponential function to construct a first event probability vector and a second event probability vector; and a weighted sum is used to construct the final event probability vector.
As some optional embodiments of the present application, the flow of constructing the loss function to achieve model training is as follows:
constructing cross entropy loss functions based on the first event probability vector and the second event probability vector, respectively;
and circularly intercepting time sequence knowledge spectrum data with different lengths to update training data, and performing model training by using a gradient descent method.
In a second aspect, the present invention provides a time-series knowledge graph inference system for multi-granularity historical aggregation, the system comprising:
the historical data acquisition unit is used for acquiring time sequence knowledge graph data of the event to be inferred;
the training data acquisition unit is used for intercepting time sequence knowledge graph data with different lengths so as to acquire training data;
a data modeling unit that models entity representations and relationship representations of different granularities using a historic recursive network based on training data, and models entity representations of different granularities using a historic convolutional network based on training data;
the model training unit calculates event probability based on the entity representation and the relationship representation modeling result and constructs a loss function to realize model training;
and the event reasoning unit is used for reasoning the missing events by using the trained model.
In a third aspect, the present invention provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the time sequence knowledge graph reasoning method of multi-granularity historical aggregation when executing the computer program.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the time sequence knowledge graph reasoning method of multi-granularity historical aggregation.
The beneficial effects of the invention are as follows:
1. the method and the system integrate historical information with different lengths and different granularities, and deeply model the evolution mode of the history, so that the accuracy of future event prediction reasoning is improved.
2. The invention learns characterization vectors for entities and relations from the perspective of characterization learning, which makes it convenient to perform other tasks at a later stage alongside future event reasoning.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope; other related drawings may be obtained from these drawings by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of a time series knowledge-graph inference method, according to an embodiment of the invention;
FIG. 2 is a data processing flow diagram of a historic recursive network according to an embodiment of the present invention;
FIG. 3 is a data processing flow diagram of a historian convolutional network according to an embodiment of the invention;
fig. 4 is a block diagram of a time series knowledge graph inference system, according to an embodiment of the invention.
Detailed Description
In order to better understand the above technical solutions, the following detailed description of the technical solutions of the present invention is made by using the accompanying drawings and specific embodiments, and it should be understood that the specific features of the embodiments and the embodiments of the present invention are detailed descriptions of the technical solutions of the present invention, and not limiting the technical solutions of the present invention, and the technical features of the embodiments and the embodiments of the present invention may be combined with each other without conflict.
It should also be appreciated that in the foregoing description of embodiments of the invention, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of at least one embodiment of the invention. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited; indeed, the claimed subject matter may lie in less than all features of a single disclosed embodiment.
Example 1
Referring to fig. 1, the invention provides a time sequence knowledge graph reasoning method of multi-granularity historical aggregation, which comprises the following steps:
(1) Acquiring time sequence knowledge graph data of an event to be inferred;
For one or more events to be inferred, historical time sequence knowledge graph data of the events are acquired. The time sequence knowledge graph data are presented as quadruples (s, r, o, t), where s denotes the head entity, r the relation, o the tail entity and t the time. The quadruples are divided into different time periods, and static knowledge graph data are constructed for each time period, expressed as follows:

$$\mathcal{G} = \{G_1, G_2, \ldots, G_t\}$$

where $\mathcal{G}$ denotes the complete time sequence knowledge graph data, $G_i$ denotes the static knowledge graph data of the i-th time period, and t+1 is the time at which the event to be inferred is located.
Specifically, the static knowledge graph data comprise an entity set, a relation set and an event set, expressed as follows:

$$G_t = (\mathcal{E}, \mathcal{R}, \mathcal{F}_t)$$

where $\mathcal{E}$ denotes the entity set, $\mathcal{R}$ denotes the relation set, and $\mathcal{F}_t$ denotes the set of all events, i.e. triplets, at time t.
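For illustration, a minimal Python sketch of this data structure follows; it is not the patented implementation, and all function and variable names are illustrative:

```python
# A minimal sketch, assuming quadruples (s, r, o, t); all names here are
# illustrative, not the patent's own code.
from collections import defaultdict

def build_static_graphs(quadruples):
    """Group (s, r, o, t) facts into one static triplet set F_t per period t."""
    per_period = defaultdict(set)
    for s, r, o, t in quadruples:
        per_period[t].add((s, r, o))
    # The ordered snapshots G_1, ..., G_t form the time sequence knowledge graph
    return [per_period[t] for t in sorted(per_period)]

quads = [("A", "visits", "B", 0), ("A", "sanctions", "C", 0), ("B", "visits", "A", 1)]
snapshots = build_static_graphs(quads)   # two static graphs: periods 0 and 1
```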
(2) Intercepting time sequence knowledge graph data with different lengths to obtain training data;
That is, assuming that the events occurring at a given moment depend only on the history of the preceding K moments, K+1 consecutive static knowledge graphs are selected from $\mathcal{G}$ for model training, with the first K static knowledge graphs regarded as the history.
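A sliding-window truncation of this kind could look as follows; this is a sketch under the stated assumption that events depend only on the preceding K moments, with illustrative names:

```python
# A sketch of the length-(K+1) truncation: each training sample pairs K history
# snapshots with the snapshot whose events are to be predicted.
def training_windows(snapshots, K):
    for start in range(len(snapshots) - K):
        history = snapshots[start:start + K]   # the first K graphs are history
        target = snapshots[start + K]          # events at the predicted moment
        yield history, target

snapshots = ["G1", "G2", "G3", "G4"]           # placeholder static graphs
for history, target in training_windows(snapshots, K=2):
    print(history, "->", target)               # ['G1','G2'] -> G3, ['G2','G3'] -> G4
```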
(3) Modeling entity representations and relationship representations of different granularities using a historical recursive network based on training data;
specifically, the flow of modeling entity characterization and relationship characterization of different granularities is as follows:
(3.1) for K static knowledge graph data with continuous histories, sequentially taking K historical paths with different lengths in the direction of reversing the histories, and aggregating entity representation and relationship representation at each moment by adopting a relationship graph convolution neural network (RGCN) based on the historical paths with different lengths;
Specifically, referring to fig. 2, $t_m$ is the target prediction time. K history paths of different lengths are taken in turn in the reverse-time direction, i.e. length = k (k = 1, …, K), and the history path of length k can be expressed as the graph sequence:

$$\{G_{t_m-k}, G_{t_m-k+1}, \ldots, G_{t_m-1}\}$$
First, starting from the beginning of the graph sequence, the event information related to each entity is aggregated onto the entity characterization sequence through a relational graph convolutional neural network (RGCN):

$$h_{o,t}^{k,l+1} = \sigma\Big(\frac{1}{c_o}\sum_{(s,r,o)\in\mathcal{F}_t} W_1^{l}\big(h_{s,t}^{k,l} + r_t^{k}\big) + W_2^{l}\, h_{o,t}^{k,l}\Big)$$

where $h_{s,t}^{k,l}$ denotes the head entity characterization sequence, $r_t^{k}$ the relation characterization sequence, $W_1^{l}$ and $W_2^{l}$ parameter matrices, $c_o$ the number of adjacent tail entities, k the history path length, l the layer number of the RGCN, s the head entity, o the tail entity and r the relation; the related events (s, r, o) all belong to the event set $\mathcal{F}_t$. The adjacent entity characterization sequences $h_{s,t}^{k,l}$ and relation characterization sequences $r_t^{k}$ are aggregated with weights onto the target entity characterization sequence $h_{o,t}^{k,l+1}$, while the entity characterization of the previous layer is retained through the self-loop term.
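A compact PyTorch sketch of one such aggregation layer is given below. It follows the formula above under the assumption that messages are mean-normalized per target entity; the module and parameter names (`w_msg` for $W_1$, `w_self` for $W_2$) are illustrative, not the patent's own code:

```python
import torch
import torch.nn as nn

class RelGraphLayer(nn.Module):
    """One RGCN-style layer: each tail entity o averages messages W1(h_s + r)
    from its (s, r) neighbors and adds a self-loop term W2 h_o."""
    def __init__(self, dim):
        super().__init__()
        self.w_msg = nn.Linear(dim, dim, bias=False)   # W1: message transform
        self.w_self = nn.Linear(dim, dim, bias=False)  # W2: self-loop transform

    def forward(self, H, R, triples):
        # H: [num_entities, dim]; R: [num_relations, dim];
        # triples: iterable of (s, r, o) index tuples from the event set F_t
        agg = torch.zeros_like(H)
        deg = torch.zeros(H.size(0), 1)
        for s, r, o in triples:
            agg[o] = agg[o] + self.w_msg(H[s] + R[r])
            deg[o] += 1                                # c_o: neighbor count
        return torch.relu(agg / deg.clamp(min=1) + self.w_self(H))

layer = RelGraphLayer(dim=8)
H, R = torch.randn(4, 8), torch.randn(2, 8)
H_next = layer(H, R, [(0, 1, 2), (3, 0, 2)])           # one propagation step
```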
(3.2) Capture the graph changes between adjacent moments alternately using gated recurrent neural networks (GRUs), so as to obtain the entity characterization matrix set $\{H^{k}\}$ and the relation characterization matrix set $\{R^{k}\}$.
Specifically, after aggregation and embedding using the RGCN, the change of the graph between adjacent moments is captured with a GRU:

$$H_t^{k} = \mathrm{GRU}\big(H_{t-1}^{k},\, \tilde{H}_t^{k}\big)$$

Here all entity characterization matrices $H_{t-1}^{k}$ at the previous moment and the entity characterization matrix $\tilde{H}_t^{k}$ processed by the RGCN in step (3.1) are regarded as adjacent sequence units and input into the GRU to obtain all entity characterization matrices $H_t^{k}$.
The GRU is also used to model the relation characterization matrix at adjacent moments:

$$R_t^{k} = \mathrm{GRU}\big(R_{t-1}^{k},\, R'_t\big)$$

where R denotes the whole relation characterization set, and $r'_t$ is obtained by:

$$r'_t = \big[\,\mathrm{pooling}\big(H_{t-1},\, \mathcal{E}_{r,t}\big)\,;\, r\,\big]$$

where $\mathcal{E}_{r,t}$ denotes the set of entities involved in events with relation r at time t, and $H_{t-1}$ denotes the entity characterization matrix at time t-1. This operation mean-pools all relevant entity vectors at time t and concatenates the result with the original relation r to construct the relation intermediate quantity $r'_t$.
Finally, the above operation is repeated recursively until the entity characterization matrix and relation characterization matrix at time $t_m-1$ are obtained under each length of sequence, thereby obtaining the entity characterization matrix set $\{H^{k}\}$ and the relation characterization matrix set $\{R^{k}\}$ (k = 1, …, K).
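As a hedged illustration of this recursion, the entity update can be realized with a standard GRU cell in which the RGCN output acts as the input and the previous matrix as the hidden state, while the relation update concatenates a mean-pooled entity summary with the relation state; all tensor names below are illustrative:

```python
import torch
import torch.nn as nn

dim, num_entities = 8, 5
gru_ent = nn.GRUCell(dim, dim)            # entity update: H_t = GRU(H_{t-1}, H_rgcn)
gru_rel = nn.GRUCell(2 * dim, dim)        # relation update takes r'_t = [pool; r]

H_prev = torch.randn(num_entities, dim)   # H_{t-1}
H_rgcn = torch.randn(num_entities, dim)   # RGCN output at time t (step 3.1)
H_t = gru_ent(H_rgcn, H_prev)             # carried forward to the next snapshot

r = torch.randn(1, dim)                   # previous state of one relation r
pooled = H_prev[[0, 2]].mean(0, keepdim=True)  # entities linked by r at time t
r_prime = torch.cat([pooled, r], dim=1)   # r'_t = [pooling(H_{t-1}, E_{r,t}); r]
r_t = gru_rel(r_prime, r)                 # updated relation characterization
```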
(4) Modeling entity representations of different granularities again using a historical convolutional network based on training data;
specifically, the flow of modeling entity characterization of different granularities again is as follows:
(4.1) For the longest history path (length = K), the entity characterization at each moment is aggregated again with the RGCN, in the same form as the aggregation formula of step (3.1).
Unlike step (3), the RGCN and GRU are no longer used alternately here to model the historical serialization pattern; instead, convolution is performed synchronously on the entity characterizations at every moment.
That is, multi-level convolution processing is performed synchronously to obtain composite entity characterization matrices of different granularities $\hat{H}^{j}$; a new historical convolution coding mode is adopted, and the convolved entity characterizations are subjected to multi-level convolution:

$$\hat{H}_t^{j} = g_t^{j} \odot \hat{H}_t^{j-1} + \big(1 - g_t^{j}\big) \odot \hat{H}_{t-1}^{j-1}, \qquad j = 2, 3, \ldots, K$$

where $\odot$ denotes dot (element-wise) multiplication between matrices and j denotes the network level in which the embedding is located. In particular, $\hat{H}_t^{1}$ is the entity characterization matrix obtained by the RGCN operation in step (4.1), and $g_t^{j}$ is the time sequence convolution gate of time period t, calculated by:

$$g_t^{j} = \sigma\big(W_j\,[\hat{H}_t^{j-1};\, \hat{H}_{t-1}^{j-1}] + b_j\big)$$

where $W_j$ denotes the weight matrix at the j-th level, $b_j$ the bias at the j-th level, and σ a normalized exponential function. Each embedded unit of each level aggregates the information of two adjacent units of the level below, so the embedded units of each level aggregate history information of different granularities, and the last entity characterization matrix of each level contains exactly the information of histories of different lengths. The entity characterization matrix of the last unit of each level is taken out for subsequent use, yielding the multi-granularity composite entity characterization matrix set $\{\hat{H}^{j}\}$; please refer to fig. 3.
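The sketch below illustrates this gated pyramid under the assumptions that the gate nonlinearity is a sigmoid and that each pair of adjacent lower-level units is combined by a convex gate; the function and parameter names are illustrative rather than the patent's:

```python
import torch

def history_pyramid(H_seq, W, b):
    """Level j combines each pair of adjacent level-(j-1) units with a gate,
    so the last unit of level j summarizes a history of length j."""
    levels = [H_seq]                       # level 1 is the raw RGCN sequence
    for j in range(1, len(H_seq)):
        prev, nxt = levels[-1], []
        for t in range(1, len(prev)):
            gate = torch.sigmoid(prev[t] @ W[j] + b[j])        # gate g_t^j
            nxt.append(gate * prev[t] + (1 - gate) * prev[t - 1])
        levels.append(nxt)
    return [lvl[-1] for lvl in levels]     # one matrix per granularity

K, n, d = 3, 4, 8
H_seq = [torch.randn(n, d) for _ in range(K)]
W = [torch.randn(d, d) * 0.1 for _ in range(K)]
b = [torch.zeros(d) for _ in range(K)]
multi_granularity = history_pyramid(H_seq, W, b)   # K composite matrices
```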
(5) Calculating event probability based on the entity characterization and relationship characterization modeling results, and constructing a loss function to realize model training;
specifically, the process of calculating event probabilities based on entity characterization and relationship characterization modeling results is as follows:
(5.1) Based on the entity characterization matrix set $\{H^{k}\}$ and relation characterization matrix set $\{R^{k}\}$ obtained in step (3), perform feature extraction using a convolutional neural network to obtain a first event feature map for each history length k;
Specifically, from the encoding of step (3), the entity characterization matrix set $\{H^{k}\}$ and relation characterization matrix set $\{R^{k}\}$ based on the historical recursive network are obtained. For an event to be inferred $(s, r, ?, t_m)$, the corresponding entity characterization sequence $\{s^{k}\}$ and relation characterization sequence $\{r^{k}\}$ are taken; the K groups of characterization sequences from different history path lengths are respectively input into K different convolutional neural networks to extract features:

$$c_k = \mathrm{Conv}_k\big([\,s^{k};\, r^{k}\,]\big)$$

where $\mathrm{Conv}_k$ denotes the k-th convolutional neural network and $c_k$ the event feature map corresponding to history length k.
(5.2) Perform vector inner product calculation based on the first event feature map of step (5.1) and the entity characterizations to obtain the first event score:

$$\mathrm{score}_{hr}^{\,k} = \mathrm{FCN}(c_k)\cdot o_k$$

where $o_k$ is the characterization vector of entity o in the k-th history sequence and FCN denotes a fully connected neural network layer; the first event score is obtained by computing the inner product of the vectors.
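A sketch of this decoder for a single history length k might look as follows; the channel count, kernel size and layer names are assumptions for illustration:

```python
import torch
import torch.nn as nn

dim, num_entities, channels = 8, 5, 4
conv_k = nn.Conv1d(2, channels, kernel_size=3, padding=1)  # Conv_k
fcn = nn.Linear(channels * dim, dim)                       # fully connected layer

s_k = torch.randn(dim)                    # subject characterization, length-k run
r_k = torch.randn(dim)                    # relation characterization, length-k run
E_k = torch.randn(num_entities, dim)      # candidate tail characterizations o_k

c_k = conv_k(torch.stack([s_k, r_k]).unsqueeze(0))  # event feature map [1, C, dim]
q = fcn(c_k.flatten(1))                             # [1, dim] query feature
score_hr_k = q @ E_k.t()                            # inner product per candidate
```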
(5.3) Based on the composite entity characterization matrix set $\{\hat{H}^{j}\}$ obtained in step (4), perform feature extraction using a convolutional neural network to obtain a second event feature map of length K;
(5.4) Perform vector inner product calculation based on the event feature map of step (5.3) and the entity characterizations of the K-th static knowledge graph to obtain the second event score. Similarly, for the entity characterization sequences of different granularities $\{\hat{s}^{j}\}$ obtained by the historical convolution network in step (4), the K groups of characterization sequences from different history path lengths are respectively input into K different convolutional neural networks to extract features:

$$\hat{c}_j = \mathrm{Conv}_j\big([\,\hat{s}^{j};\, r\,]\big)$$

The corresponding second event score is then calculated:

$$\mathrm{score}_{hc}^{\,j} = \mathrm{FCN}(\hat{c}_j)\cdot \hat{o}_j$$

Unlike the event score calculation for the historical recursive network, here the entity o must have appeared in the same historical event as the entity s.
Specifically, a loss function is constructed to implement the model training process as follows:
(5.5) constructing a first event probability vector and a second event probability vector based on the first event score of step (5.2) and the second event score of step (5.4) using a normalized exponential function, respectively; and constructing a final event probability vector by adopting weighted summation;
that is, first, for the obtained first event score and second event score, a first event probability vector and a second event probability vector are constructed using a normalized exponential function, respectively:
$$p_{hc} = \mathrm{Softmax}(\mathrm{score}_{hc}), \qquad p_{hr} = \mathrm{Softmax}(\mathrm{score}_{hr})$$

$$p_{final} = \alpha\, p_{hc} + (1-\alpha)\, p_{hr}$$

According to the obtained first event score and second event score, the event probabilities corresponding to the features extracted by the different networks can be calculated using a normalized exponential function; to balance the contributions of the two networks to the event score, a weighted summation with hyper-parameter α is used to obtain the final event probability vector $p_{final}$.
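In code this fusion is only a few lines; the sketch below assumes α = 0.5 purely for illustration:

```python
import torch

alpha = 0.5                                  # hyper-parameter balancing the branches
score_hr = torch.randn(1, 5)                 # historical recursive branch scores
score_hc = torch.randn(1, 5)                 # historical convolution branch scores

p_hr = torch.softmax(score_hr, dim=1)        # normalized exponential function
p_hc = torch.softmax(score_hc, dim=1)
p_final = alpha * p_hc + (1 - alpha) * p_hr  # final event probability vector
```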
(5.6) Construct cross entropy loss functions based on the first event probability vector and the second event probability vector, expressed as:

$$L_{hc} = -\,y^{\top}\log p_{hc}, \qquad L_{hr} = -\,y^{\top}\log p_{hr}$$

$$L_{final} = \alpha \times L_{hc} + (1-\alpha) \times L_{hr}$$

where y is the entity label vector at the prediction time $t_m$: when the event $(s, r, o, t_m)$ really occurs, the component of y corresponding to entity o is 1, and otherwise 0.
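A minimal sketch of the combined objective follows, assuming integer tail-entity labels; note that `F.cross_entropy` fuses the softmax and the negative log-likelihood, which for a one-hot y matches the $-y^{\top}\log p$ form above:

```python
import torch
import torch.nn.functional as F

alpha = 0.5
score_hr = torch.randn(2, 5, requires_grad=True)  # [batch, num_entities]
score_hc = torch.randn(2, 5, requires_grad=True)
true_tail = torch.tensor([3, 0])                  # index of the true entity o

L_hr = F.cross_entropy(score_hr, true_tail)       # cross entropy, branch hr
L_hc = F.cross_entropy(score_hc, true_tail)       # cross entropy, branch hc
L_final = alpha * L_hc + (1 - alpha) * L_hr
L_final.backward()                                # gradients for the descent step
```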
(5.7) Cyclically intercept time sequence knowledge graph data of different lengths to update the training data, and perform model training using a gradient descent method: take the moment following the starting moment selected in step (2) as the new starting moment, again select a graph sequence of length K+1 for training, use the first K subgraphs as the history sequence to predict the events in the (K+1)-th subgraph, and calculate the loss function to train with gradient descent.
(6) Infer the missing events using the trained model parameters. That is, after training is completed, the events that actually need to be inferred at the target time t+1 are inferred from the obtained entity characterizations and relation characterizations, and the event with the highest probability is taken as the inference result.
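At inference time the prediction therefore reduces to an argmax over the final probability vector, as in this sketch with illustrative numbers:

```python
import torch

# Final probabilities for the query (s, r, ?, t+1) over five candidate entities
p_final = torch.tensor([[0.05, 0.10, 0.60, 0.15, 0.10]])
predicted = p_final.argmax(dim=1).item()   # entity index 2: the inferred event
```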
Specifically, it should be noted that lowercase k refers to a history path length, while uppercase K refers to the maximum history path length and is a constant; $t_m$ denotes the prediction time, and t, where not otherwise indicated, simply denotes time.
Example 2
Referring to fig. 4, the present invention provides a time-series knowledge graph inference system for multi-granularity historical aggregation, the system corresponds to the method of embodiment 1 one by one, and the system includes:
the historical data acquisition unit is used for acquiring time sequence knowledge graph data of the event to be inferred;
the training data acquisition unit is used for intercepting time sequence knowledge graph data with different lengths so as to acquire training data;
a data modeling unit that models entity representations and relationship representations of different granularities using a historic recursive network based on training data, and models entity representations of different granularities using a historic convolutional network based on training data;
the model training unit calculates event probability based on the entity representation and the relationship representation modeling result and constructs a loss function to realize model training;
and the event reasoning unit is used for reasoning the missing events by using the trained model.
Example 3
The invention provides a computer device comprising a memory and a processor, wherein the memory stores a computer program which, when run by the processor, performs the time sequence knowledge graph reasoning method of multi-granularity historical aggregation of embodiment 1.
The computer device provided in this embodiment may implement the method described in embodiment 1, and in order to avoid repetition, a description thereof will be omitted.
Example 4
The present invention provides a computer readable storage medium having a computer program stored thereon, which when executed by a processor implements a time-series knowledge graph inference method for multi-granularity historical aggregation as described in embodiment 1.
The computer readable storage medium provided in this embodiment may implement the method described in embodiment 1, and will not be described herein in detail to avoid repetition.
The processor may be a central processing unit (CPU), another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory can be used to store the computer program and/or modules, and the processor implements the various functions of the time sequence knowledge graph reasoning system of multi-granularity historical aggregation by running or executing the computer program and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, application programs required for at least one function (such as a sound playing function, an image playing function, etc.), and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a smart memory card, a secure digital card, a flash memory card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid state storage device.
The time sequence knowledge graph reasoning system of multi-granularity historical aggregation can be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a stand-alone product. Based on such understanding, the present invention implements all or part of the flow of the method of the above-described embodiments; the steps of each method embodiment described above may also be implemented by a computer program stored in a computer readable storage medium, the computer program implementing those steps when executed by a processor. The computer program comprises computer program code, which may take the form of source code, object code, an executable file, or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory, a random access memory, an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content of the computer readable medium can be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction.
Having described the basic concept of the invention, it will be apparent to those skilled in the art that the foregoing detailed disclosure is by way of example only and is not intended to be limiting. Although not explicitly stated herein, various modifications, improvements, and adaptations of the present disclosure may occur to those skilled in the art. Such modifications, improvements, and adaptations are suggested within this specification, and are therefore intended to be included within the spirit and scope of the exemplary embodiments of the present invention.
The computer storage medium may contain a propagated data signal with the computer program code embodied therein, for example, on a baseband or as part of a carrier wave. The propagated signal may take on a variety of forms, including electro-magnetic, optical, etc., or any suitable combination thereof. A computer storage medium may be any computer readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated through any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or a combination of any of the foregoing.
Claims (10)
1. A time sequence knowledge graph reasoning method of multi-granularity historical aggregation is characterized by comprising the following steps:
acquiring time sequence knowledge graph data of an event to be inferred;
intercepting time sequence knowledge graph data with different lengths to obtain training data;
modeling entity representations and relationship representations of different granularities using a historical recursive network based on training data;
modeling entity representations of different granularities again using a historical convolutional network based on training data;
calculating event probability based on the entity characterization and relationship characterization modeling results, and constructing a loss function to realize model training;
the event of the absence is inferred using the trained model.
2. The method for reasoning the time sequence knowledge graph of multi-granularity historical aggregation according to claim 1, wherein the time sequence knowledge graph data comprises static knowledge graph data of a plurality of different time periods.
3. The method for reasoning the time sequence knowledge graph of multi-granularity historical aggregation as claimed in claim 2, wherein the static knowledge graph data comprises an entity set, a relation set and an event set.
4. The method for reasoning the time sequence knowledge graph of multi-granularity historical aggregation according to claim 1, wherein the flow of modeling entity characterization and relationship characterization with different granularities by using a historical recursive network based on training data is as follows:
acquiring historical paths with different lengths based on training data, and aggregating entity characterization and relationship characterization at each moment by adopting a relationship graph convolutional neural network based on the historical paths with different lengths;
and capturing the map change between adjacent moments by alternately using a gating cyclic neural network so as to obtain an entity characterization matrix set and a relation characterization matrix set.
5. The method for reasoning the time sequence knowledge graph of multi-granularity historical aggregation according to claim 4, wherein the flow of modeling entity characterization with different granularity again by using a historical convolution network based on training data is as follows:
acquiring a longest historical path based on training data, and aggregating entity representations at each moment by adopting a relation graph convolutional neural network based on the longest historical path;
and synchronously carrying out multi-level convolution processing on the maps between adjacent moments by using a convolution neural network so as to obtain a composite entity characterization matrix set.
6. The method for reasoning the time sequence knowledge graph of multi-granularity historical aggregation according to claim 5, wherein the process of calculating the event probability based on the entity characterization and the relationship characterization modeling results is as follows:
based on the entity characterization matrix set and the relation characterization matrix set, performing feature extraction by adopting a convolutional neural network to obtain a first event feature map;
performing vector inner product calculation based on the first event feature map to obtain a first event score;
based on the composite entity characterization matrix set, performing feature extraction by adopting a convolutional neural network to obtain a second event feature map;
performing vector inner product calculation based on the second event feature map to obtain a second event score;
based on the first event score and the second event score, respectively adopting a normalized exponential function to construct a first event probability vector and a second event probability vector; and a weighted sum is used to construct the final event probability vector.
7. The method for reasoning the time sequence knowledge graph of multi-granularity historical aggregation as claimed in claim 6, wherein the loss function is constructed to realize the model training process as follows:
constructing cross entropy loss functions based on the first event probability vector and the second event probability vector, respectively;
and circularly intercepting time sequence knowledge graph data with different lengths to update training data, and performing model training by using a gradient descent method.
8. A time-series knowledge graph reasoning system of multi-granularity historical aggregation, the system comprising:
the historical data acquisition unit is used for acquiring time sequence knowledge graph data of the event to be inferred;
the training data acquisition unit is used for intercepting time sequence knowledge graph data with different lengths so as to acquire training data;
a data modeling unit that models entity representations and relationship representations of different granularities using a historic recursive network based on training data, and models entity representations of different granularities using a historic convolutional network based on training data;
the model training unit calculates event probability based on the entity representation and the relationship representation modeling result and constructs a loss function to realize model training;
and the event reasoning unit is used for reasoning the missing events by using the trained model.
9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized by: the processor, when executing the computer program, implements a time-series knowledge graph reasoning method of multi-granularity historical aggregation as claimed in any one of claims 1-7.
10. A computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, the computer program when executed by a processor implementing the time-series knowledge graph inference method of multi-granularity historical aggregation of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310276162.7A CN116402138A (en) | 2023-03-21 | 2023-03-21 | Time sequence knowledge graph reasoning method and system for multi-granularity historical aggregation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310276162.7A CN116402138A (en) | 2023-03-21 | 2023-03-21 | Time sequence knowledge graph reasoning method and system for multi-granularity historical aggregation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116402138A true CN116402138A (en) | 2023-07-07 |
Family
ID=87009547
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310276162.7A Pending CN116402138A (en) | 2023-03-21 | 2023-03-21 | Time sequence knowledge graph reasoning method and system for multi-granularity historical aggregation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116402138A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117952198A (en) * | 2023-11-29 | 2024-04-30 | 海南大学 | Time sequence knowledge graph representation learning method based on time characteristics and complex evolution |
- 2023-03-21: CN application CN202310276162.7A filed; published as CN116402138A; status Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110366734B (en) | Optimizing neural network architecture | |
CN110168578B (en) | Multi-tasking neural network with task-specific paths | |
CN111406264B (en) | Neural architecture search | |
CN111617478B (en) | Game formation intensity prediction method and device, electronic equipment and storage medium | |
CN112418292B (en) | Image quality evaluation method, device, computer equipment and storage medium | |
WO2019155064A1 (en) | Data compression using jointly trained encoder, decoder, and prior neural networks | |
CN111176758B (en) | Configuration parameter recommendation method and device, terminal and storage medium | |
US20230196202A1 (en) | System and method for automatic building of learning machines using learning machines | |
CN111753076B (en) | Dialogue method, dialogue device, electronic equipment and readable storage medium | |
CN114780739B (en) | Time sequence knowledge graph completion method and system based on time graph convolution network | |
CN114422382B (en) | Network flow prediction method, computer device, product and storage medium | |
US20220335297A1 (en) | Anticipatory Learning Method and System Oriented Towards Short-Term Time Series Prediction | |
CN112508190A (en) | Method, device and equipment for processing structured sparse parameters and storage medium | |
CN114792378A (en) | Quantum image identification method and device | |
CN114358319B (en) | Machine learning framework-based classification method and related device | |
CN113609337A (en) | Pre-training method, device, equipment and medium of graph neural network | |
KR20200063041A (en) | Method and apparatus for learning a neural network using unsupervised architecture variation and supervised selective error propagation | |
CN116402138A (en) | Time sequence knowledge graph reasoning method and system for multi-granularity historical aggregation | |
Robles et al. | Learning to reinforcement learn for neural architecture search | |
CN114764619A (en) | Convolution operation method and device based on quantum circuit | |
CN113408721A (en) | Neural network structure searching method, apparatus, computer device and storage medium | |
CN116303961A (en) | Knowledge graph question-answering method and system based on multi-hop combined graph fingerprint network | |
CN115604131A (en) | Link flow prediction method, system, electronic device and medium | |
CN115601098A (en) | Sequence recommendation method and system based on coupling relation between article attributes and time sequence modes | |
CN115795005A (en) | Session recommendation method and device integrating contrast learning denoising optimization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |