CN112860918B

CN112860918B - Sequential knowledge graph representation learning method based on collaborative evolution modeling

Info

Publication number: CN112860918B
Application number: CN202110305818.4A
Authority: CN
Inventors: 张嘉昇; 梁爽; 邵杰
Original assignee: Sichuan Artificial Intelligence Research Institute Yibin
Current assignee: Sichuan Artificial Intelligence Research Institute Yibin
Priority date: 2021-03-23
Filing date: 2021-03-23
Publication date: 2023-03-14
Anticipated expiration: 2041-03-23
Also published as: CN112860918A

Abstract

The invention provides a sequential knowledge graph representation learning method based on collaborative evolution modeling, which belongs to the technical field of time sequence knowledge graphs. The method comprises: initializing the parameters of a model and the embedded representation of any entity and relationship according to the time sequence knowledge graph to be represented; calculating the occurrence probability of each known fact, and obtaining the evolution loss of the local structure by maximizing the occurrence probability of the known facts; calculating the corresponding soft modularity for the graph structure of each time sequence knowledge graph snapshot, and obtaining the evolution loss of the global structure by maximizing the soft modularity; calculating the overall loss function of the model; and iteratively optimizing the overall loss function of the model by a gradient descent method until the model converges. The invention solves the problem that previous work ignored the evolutionary nature of the time sequence knowledge graph and therefore could not obtain an accurate embedded representation.

Description

Sequential knowledge graph representation learning method based on collaborative evolution modeling

Technical Field

The invention belongs to the technical field of time sequence knowledge graphs, and particularly relates to a time sequence knowledge graph representation learning method based on collaborative evolution modeling.

Background

A knowledge graph is a knowledge base system with semantic attributes that is widely used for storing and managing structured data in various fields, such as dynamic social interaction. A knowledge graph can be represented as a heterogeneous directed graph in which nodes represent real-world entities and concepts, and labeled directed edges represent the relationships between them. Although many knowledge graph representation learning methods have been proposed, they rarely consider the dynamics of knowledge graphs and, in particular, ignore their evolutionary nature. The updating of knowledge is reflected in the knowledge graph as the appearance and disappearance of entities or the establishment and removal of relationships, so the knowledge graph is time-varying and evolving. Because existing work ignores this temporal nature, the embedded representations it learns are inaccurate and unreasonable.

In recent years, some work has attempted to learn embedded representations for such time-varying knowledge graphs, a task also known as time sequence knowledge graph representation learning. These methods fall mainly into four types: approaches based on temporal relationship dependency, which incorporate time information by constraining the objective order of occurrence between relationships; approaches based on temporal hyperplanes, which learn the embedded representation at each time separately by mapping the knowledge at different times onto different hyperplanes; approaches based on diachronic entity embedding, which treat the embedded representation of an entity as a time-dependent non-linear function; and approaches based on tensor decomposition, which learn the embedded representation of a time sequence knowledge graph using a low-rank decomposition of adjacency matrices.

However, the above works either learn the embedded representation for each time instant independently, ignoring the evolutionary nature of the time sequence knowledge graph, or reduce that evolution to the non-linear dynamics of entities, which cannot reflect the detailed evolution mechanism of the time sequence knowledge graph. In fact, from the perspective of the local structure, relationships are continuously established or released between entities as time progresses, and this drives the evolution of the time sequence knowledge graph. From the perspective of the global structure, the establishment and release of a large number of relationships together form a slow evolution process of the community structure in the time sequence knowledge graph. Moreover, local structure evolution and global structure evolution are not independent: the local structure evolution is the internal mechanism of the global structure evolution, and the global structure evolution is an external driving factor of the local structure evolution. A more accurate time sequence knowledge graph embedded representation can therefore be learned by considering the collaborative evolution process of the local structure and the global structure, but the prior art does not take this into account.

Disclosure of Invention

In view of the above defects in the prior art, the invention provides a sequential knowledge graph representation learning method based on collaborative evolution modeling, whose innovations are that the evolution process of time sequence knowledge is modeled from the two perspectives of the local structure and the global structure at the same time, and that a new soft modularity is proposed for measuring the community structure.

In order to achieve the above purpose, the invention adopts the technical scheme that:

the scheme provides a sequential knowledge graph representation learning method based on collaborative evolution modeling, which comprises the following steps:

S1, initializing the parameters of a model and the embedded representation of any entity and relationship according to the time sequence knowledge graph to be represented;

S2, inputting the known facts of the time sequence knowledge graph in the order of their timestamps in the time sequence knowledge graph, calculating the occurrence probability of each known fact, and obtaining the evolution loss of the local structure by maximizing the occurrence probability of the known facts;

S3, inputting the time sequence knowledge graph snapshots of the time sequence knowledge graph under each timestamp in time sequence, calculating the corresponding soft modularity for the graph structure of each time sequence knowledge graph snapshot, and maximizing the soft modularity to obtain the evolution loss of the global structure;

S4, calculating the overall loss function of the model according to the evolution loss of the local structure and the evolution loss of the global structure;

S5, iteratively optimizing the overall loss function of the model by a gradient descent method, and updating the parameters of the model and the embedded representations of the entities and relationships;

and S6, judging whether the model is converged, if so, obtaining the final entity and relationship embedded representation, finishing the learning of the time sequence knowledge graph representation, and otherwise, returning to the step S1.
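The optimization in steps S4 to S6 can be illustrated by a minimal sketch. It is an assumption-laden toy, not the patented implementation: the function name `train`, the learning rate, the tolerance-based convergence test and the two quadratic stand-in losses are all hypothetical, and the sketch simply re-evaluates both losses with the current parameters on every iteration.

```python
import numpy as np

def train(params, local_loss_grad, global_loss_grad,
          lr=0.1, tol=1e-8, max_iter=1000):
    """Gradient descent on L = L_local + L_global (hypothetical helper).

    Each *_loss_grad callable returns (loss value, gradient w.r.t. params).
    Convergence is declared when the overall loss changes by less than tol.
    """
    prev_loss = np.inf
    loss = np.inf
    for _ in range(max_iter):
        l_local, g_local = local_loss_grad(params)
        l_global, g_global = global_loss_grad(params)
        loss = l_local + l_global                    # S4: overall loss
        if abs(prev_loss - loss) < tol:              # S6: convergence check
            break
        params = params - lr * (g_local + g_global)  # S5: gradient step
        prev_loss = loss
    return params, loss

# Toy stand-ins for the two evolution losses (assumptions for illustration).
toy_local = lambda x: (np.sum((x - 1.0) ** 2), 2.0 * (x - 1.0))
toy_global = lambda x: (np.sum((x + 1.0) ** 2), 2.0 * (x + 1.0))

final_params, final_loss = train(np.array([3.0, -2.0]), toy_local, toy_global)
```

With these toy losses the combined objective is minimized at the origin, which is where the loop ends up; in the method itself the two callables would be replaced by the local and global evolution losses.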

The beneficial effects of the invention are: the invention designs a novel sequential knowledge graph representation learning method based on co-evolution, which can model the evolution process of knowledge from two aspects of local evolution and global evolution and capture the internal mechanism of knowledge evolution, thereby learning more accurate representation vectors to improve the performance of downstream tasks such as event prediction and the like. Compared with the prior method, the method provided by the invention has higher operation efficiency and can adapt to the online environment of streaming data.

Further, in the step S1, the expression for initializing the embedded representation u_e^τ of any entity e under the timestamp τ is as follows:

wherein θ_e, ω_e and v_e all represent vectors specific to the current entity.

The beneficial effects of the above further scheme are: the different evolution modes of different entities can be fully considered, such as periodic evolution patterns, non-periodic evolution trends and static attributes.
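The exact initialization expression is not reproduced in this text. Purely as an illustration of how three entity-specific vectors could cover a non-periodic trend, a periodic pattern and a static attribute, the following sketch assumes an additive form; the function name `entity_embedding`, the use of a sine term and the additive combination are all assumptions.

```python
import numpy as np

def entity_embedding(theta_e, omega_e, v_e, tau):
    """Hypothetical time-dependent embedding of one entity at timestamp tau."""
    trend = theta_e * tau               # non-periodic evolution trend
    periodic = np.sin(omega_e * tau)    # periodic evolution pattern
    static = v_e                        # static attributes
    return trend + periodic + static

rng = np.random.default_rng(0)
dim = 4
theta_e, omega_e, v_e = rng.normal(size=(3, dim))

u_at_0 = entity_embedding(theta_e, omega_e, v_e, tau=0.0)
u_at_5 = entity_embedding(theta_e, omega_e, v_e, tau=5.0)
```

Under this assumed form the embedding at τ = 0 reduces to the static vector, and later timestamps add the trend and periodic contributions.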

Still further, the step S2 includes the steps of:

S201, inputting the known facts of the current time sequence knowledge graph in the order of their timestamps τ in the time sequence knowledge graph, and calculating, according to the participants of any known fact (s, r, o, τ), the intensity with which the fact occurs spontaneously

Wherein the participants are entities s, o and relations r contained in the known facts;

S202, dividing the excitation effect, on the current dynamic fact, of the historical facts occurring at time τ_i

into two parts:

wherein η_{s,r}(τ_i) and η_{o,r}(τ_i) respectively represent the effect on the current dynamic fact of the historical facts of the head entity s and the tail entity o at time τ_i,

represents the set of relationships that entity e has at time τ_i,

represents the relationship-level attention,

and Z_r represent the embedded representations of the relationships in the historical facts,

represents the relationship contained in the historical event, V represents a parameter matrix for measuring the similarity between relationship vectors, h represents an entity that has a relationship with entity e at time τ_i, β_{h,x} represents the entity-level attention,

represents the vector representation of h at time τ_i,

represents the vector representation of x at time τ_i, x represents the entity that has a relationship with entity e in the current dynamic fact, r' represents any one of the relationships of entity e at time τ_i, h' represents a specific instance of h,

represents the set of entities that have a relationship with entity e at time τ_i,

represents the vector representation of h' at time τ_i;

S203, according to the spontaneous occurrence intensity of the fact

and the two parts of the excitation effect on the current dynamic fact

calculating the occurrence intensity of the known fact (s, r, o, τ)

S204, according to the occurrence intensity of the known fact (s, r, o, τ)

calculating the occurrence probability p(s, r, o | I(τ)) of each known fact;

S205, according to the occurrence probability of each known fact, obtaining the evolution loss L_local of the local structure by maximizing the occurrence probability of the facts:

wherein I(τ) represents the set of historical events before time τ.

The beneficial effects of the further scheme are as follows: adaptive importance weighting is applied to different historical events to flexibly account for different effects of different historical facts on the current fact.

Still further, the expression of the spontaneous occurrence intensity of the fact in the step S201

is as follows:

wherein

and

respectively represent the embedded representations of the head entity s and the tail entity o of the fact under the timestamp, Z_r represents the embedded representation corresponding to the relationship r, and w represents a learnable parameter matrix for measuring the similarity between the vectors.

The beneficial effects of the further scheme are as follows: the spontaneous fact at any moment can be effectively identified.

Still further, the expression of the occurrence intensity of the known fact (s, r, o, τ) in the step S203

is as follows:

wherein

represents the spontaneous occurrence intensity of the fact, θ represents a hyper-parameter,

represents the excitation effect of the historical facts on the current fact, τ represents the occurrence time of the current fact, τ_i represents the occurrence time of the historical event, and k(τ - τ_i) represents a time decay function.

The beneficial effects of the further scheme are as follows: both the spontaneous occurrence intensity of the fact and the influence of historical facts on it are considered, so that the occurrence intensity of the fact can be fully modeled.
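The combination described above, a spontaneous term plus a hyper-parameter-weighted sum of historical excitations damped by a time decay function, can be sketched as follows. The exact expression is not reproduced in this text, so the additive form, the exponential decay kernel, the `decay_rate` parameter and the function name `occurrence_intensity` are assumptions made for illustration.

```python
import numpy as np

def occurrence_intensity(spontaneous, excitations, event_times, tau,
                         theta=0.5, decay_rate=1.0):
    """Hypothetical intensity: spontaneous + theta * sum_i g_i * k(tau - tau_i).

    spontaneous : spontaneous occurrence intensity of the fact
    excitations : excitation effect g_i of each historical event
    event_times : occurrence time tau_i of each historical event
    tau         : occurrence time of the current fact
    theta       : hyper-parameter weighting the historical influence
    decay_rate  : rate of the assumed exponential decay k(dt) = exp(-rate * dt)
    """
    excitations = np.asarray(excitations, dtype=float)
    event_times = np.asarray(event_times, dtype=float)
    past = event_times < tau                     # only events strictly before tau
    decay = np.exp(-decay_rate * (tau - event_times[past]))
    return spontaneous + theta * np.sum(excitations[past] * decay)

# One event at time 1.0 lies in the past of tau = 3.0; the event at 3.0 does not.
lam = occurrence_intensity(0.2, excitations=[1.0, 2.0],
                           event_times=[1.0, 3.0], tau=3.0)
```

Because of the decay term, older historical events contribute less to the intensity of the current fact than recent ones.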

Still further, the expression of the occurrence probability p(s, r, o | I(τ)) of each known fact in the step S204 is as follows:

wherein

represents the occurrence intensity of the candidate fact (e, r, o, τ),

represents the occurrence intensity of the candidate fact (s, r, e, τ), e represents any entity in the entity set, ε represents the entity set of the time sequence knowledge graph, I(τ) represents the set of historical events before time τ, s represents the head entity contained in the current fact, r represents the relationship contained in the current fact, and o represents the tail entity contained in the current fact.

The beneficial effects of the above further scheme are: the probability of occurrence of valid facts is substantially maximized.
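As a hedged illustration of how an occurrence probability and the local evolution loss could be computed from intensities, the sketch below normalizes the exponentiated intensity of a fact over the candidates obtained by replacing its head or its tail with every entity. The single shared normalizer, the negative log-likelihood form of the loss and the names `fact_log_prob` and `local_loss` are assumptions; the exact expressions are not reproduced in this text.

```python
import numpy as np

def fact_log_prob(intensity, s, o):
    """Log-probability of fact (s, r, o) for a fixed relation r and time tau.

    intensity[h, t] is the (possibly negative) occurrence intensity of the
    candidate fact (h, r, t, tau). Assumed normaliser: head-replaced
    candidates (e, r, o) together with tail-replaced candidates (s, r, e).
    """
    candidates = np.concatenate([intensity[:, o], intensity[s, :]])
    shift = candidates.max()                      # numerically stable log-sum-exp
    log_norm = shift + np.log(np.exp(candidates - shift).sum())
    return intensity[s, o] - log_norm

def local_loss(intensity, facts):
    """Negative log-likelihood of the known (head, tail) pairs."""
    return -sum(fact_log_prob(intensity, s, o) for s, o in facts)

uniform = np.zeros((3, 3))                        # three entities, equal intensities
loss_uniform = local_loss(uniform, [(0, 1)])

boosted = uniform.copy()
boosted[0, 1] = 5.0                               # make the known fact more intense
loss_boosted = local_loss(boosted, [(0, 1)])
```

Raising the intensity of a known fact relative to its candidates lowers the loss, which is the sense in which minimizing the loss maximizes the occurrence probability of the known facts.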

Still further, the step S3 includes the steps of:

S301, inputting the time sequence knowledge graph snapshots of the time sequence knowledge graph under each timestamp in time sequence, and calculating the connection strength between two entities

S302, according to the connection strength

calculating the soft modularity corresponding to the graph structure of each time sequence knowledge graph snapshot, wherein the expression of each element of the soft modularity

is as follows:

wherein

and

respectively represent the degrees of entity i and entity j under the timestamp τ, and m^τ represents the total number of relationships existing in the time sequence knowledge graph under the timestamp τ;

S303, calculating the community allocation vector of each entity

S304, according to the community allocation vector of each entity, maximizing the soft modularity to obtain the evolution loss L_global of the global structure.

The beneficial effects of the further scheme are as follows: the dynamics and the heterogeneity of the time sequence knowledge graph can be fully considered.
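The element-wise expression of the soft modularity is not reproduced in this text. The sketch below assumes the standard modularity form, in which the connection strength between entities i and j is compared with the expectation k_i k_j / (2m) derived from their degrees; the function name `soft_modularity_matrix` and this specific form are assumptions.

```python
import numpy as np

def soft_modularity_matrix(strength, degrees, m):
    """Assumed form B_ij = strength_ij - k_i * k_j / (2 * m) for one snapshot.

    strength : (n, n) connection strengths between entities
    degrees  : (n,) degree of each entity under the timestamp
    m        : total number of relationships in the snapshot
    """
    strength = np.asarray(strength, dtype=float)
    degrees = np.asarray(degrees, dtype=float)
    return strength - np.outer(degrees, degrees) / (2.0 * m)

# Two entities joined by a single relationship of strength 1.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
B = soft_modularity_matrix(A, degrees=[1, 1], m=1)
```

Positive entries mark pairs that are connected more strongly than their degrees alone would suggest, which is what a community allocation should group together.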

Still further, the expression of the connection strength between the two entities in the step S301

is as follows:

wherein r represents any relationship in the set

of relationships,

which is the set of relationships existing between entity i and entity j under the timestamp τ, Z_r represents the vector representation of the relationship r, the accompanying parameter vector measures the connection strength of different relationships, and

represents a non-linear activation function.

The beneficial effects of the above further scheme are: different connection strengths between entities brought by different relationships can be flexibly considered.
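A minimal sketch of a relationship-aware connection strength follows. It assumes that each relationship between two entities contributes the activation of an inner product between a parameter vector and the relationship embedding, and that contributions are summed; the sigmoid activation, the symmetric accumulation, the parameter name `a` and the function name `connection_strength` are assumptions, since the exact expression is not reproduced in this text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def connection_strength(n_entities, edges, Z, a):
    """Assumed form: strength_ij = sum over relations r between i and j
    of sigmoid(a . Z[r]).

    edges : list of (i, r, j) triples present in the snapshot
    Z     : (n_relations, dim) relationship embeddings
    a     : (dim,) hypothetical parameter vector weighting relationships
    """
    strength = np.zeros((n_entities, n_entities))
    for i, r, j in edges:
        weight = sigmoid(a @ Z[r])
        strength[i, j] += weight
        strength[j, i] += weight   # treated as undirected for community structure
    return strength

Z = np.zeros((2, 3))               # two relationships with zero embeddings
a = np.ones(3)
edges = [(0, 0, 1), (0, 1, 1), (1, 0, 2)]
S = connection_strength(3, edges, Z, a)
```

Two entities linked by several relationships accumulate a larger strength than a pair linked by one, and the parameter vector lets different relationship types contribute differently.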

Still further, the expression of the community allocation vector of each entity in the step S303

is as follows:

wherein F represents a parameter matrix for mapping the embedded representation of the entity to the community allocation vector of the entity,

represents the embedded representation of entity i under the timestamp τ, and

represents the embedded representation corresponding to the community to which entity i belonged under the previous timestamp.

The beneficial effects of the further scheme are as follows: the community division of the entity can be calculated based on the topological structure of the time sequence knowledge graph under the current timestamp and the slow evolution characteristic of the community.
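To illustrate a soft community allocation that depends on both the current entity embedding and the previous community embedding, the sketch below assumes that the two are added, mapped by F and passed through a softmax. The additive combination, the softmax and the function name `community_allocation` are assumptions; the exact expression is not reproduced in this text.

```python
import numpy as np

def community_allocation(F, u_i, c_prev):
    """Hypothetical soft allocation of one entity over K communities.

    F      : (K, dim) parameter matrix mapping embeddings to communities
    u_i    : (dim,) embedding of entity i under the current timestamp
    c_prev : (dim,) embedding of the community entity i belonged to
             under the previous timestamp
    """
    logits = F @ (u_i + c_prev)
    logits = logits - logits.max()       # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

rng = np.random.default_rng(0)
F = rng.normal(size=(3, 4))
u_i = rng.normal(size=4)
c_prev = rng.normal(size=4)

h_i = community_allocation(F, u_i, c_prev)
h_uniform = community_allocation(np.zeros((3, 4)), u_i, c_prev)
```

The result is a probability vector, so an entity may belong partly to several communities, and including the previous community embedding biases the allocation toward slow changes.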

Still further, the expression of the evolution loss L_global of the global structure in the step S304 is as follows:

wherein T represents the transpose, m^τ represents the total number of relationships existing in the time sequence knowledge graph under the timestamp τ, tr(·) represents the trace of a matrix, H^τ represents the community allocation matrix under the timestamp τ,

represents the soft modularity matrix, and norm(·) represents two-norm regularization.

The beneficial effects of the further scheme are as follows: the method can simplify the maximization process of the soft modularity and accelerate the convergence of the model.
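The trace-based loss can be sketched for a single snapshot as follows. The sketch assumes the form -tr(HᵀBH) / (2m), reads norm(·) as a row-wise two-norm normalization of the allocation matrix, and uses the standard modularity matrix for B; each of these readings, and the function name `global_loss`, is an assumption because the exact expression is not reproduced in this text.

```python
import numpy as np

def global_loss(B, H, m):
    """Assumed single-snapshot loss: -tr(norm(H)^T B norm(H)) / (2 * m).

    B : (n, n) soft modularity matrix
    H : (n, K) community allocation matrix (one row per entity)
    m : total number of relationships in the snapshot
    """
    H = np.asarray(H, dtype=float)
    H_normed = H / np.linalg.norm(H, axis=1, keepdims=True)  # row-wise 2-norm
    return -np.trace(H_normed.T @ B @ H_normed) / (2.0 * m)

# Four entities, two disjoint relationships: 0-1 and 2-3.
A = np.zeros((4, 4))
A[0, 1] = A[1, 0] = A[2, 3] = A[3, 2] = 1.0
degrees = A.sum(axis=1)
m = 2
B = A - np.outer(degrees, degrees) / (2.0 * m)

H_good = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])  # matches the two pairs
H_bad = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])   # splits both pairs

loss_good = global_loss(B, H_good, m)
loss_bad = global_loss(B, H_bad, m)
```

The allocation that keeps connected entities together yields the lower loss, so minimizing this quantity corresponds to maximizing the modularity.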

Drawings

FIG. 1 is a flow chart of the method of the present invention.

FIG. 2 is a flowchart of a method applied to a dynamic social network.

Detailed Description

The following description of the embodiments of the present invention is provided to help those skilled in the art understand the invention. It should be understood, however, that the invention is not limited to the scope of the embodiments. To those skilled in the art, various changes are apparent as long as they remain within the spirit and scope of the invention as defined by the appended claims, and everything created using the inventive concept falls within the scope of protection.

Examples

As shown in FIG. 1, the invention provides a sequential knowledge graph representation learning method based on collaborative evolution modeling, which is implemented as follows:

In this embodiment, the expression for initializing the embedded representation of any entity e under the timestamp τ

is as follows:

wherein θ_e, ω_e and v_e represent vectors specific to the current entity.

S2, inputting the known facts of the time sequence knowledge graph to calculate the occurrence probability of each known fact according to the sequence of the corresponding time stamps of the facts in the time sequence knowledge graph, and obtaining the evolution loss of the local structure by maximizing the occurrence probability of the known facts, wherein the implementation method comprises the following steps:

S202, dividing the excitation effect, on the current dynamic fact, of the historical facts occurring at time τ_i into two parts;

S203, according to the spontaneous occurrence intensity of the fact

and the two parts of the excitation effect on the current dynamic fact, calculating the occurrence intensity of the known fact (s, r, o, τ);

S204, according to the occurrence intensity of the known fact (s, r, o, τ)

calculating the occurrence probability p(s, r, o | I(τ)) of each known fact;

S205, according to the occurrence probability of each known fact, obtaining the evolution loss L_local of the local structure by maximizing the occurrence probability of the facts.

In this embodiment, in order to consider the influence of historical facts on the occurrence probability of the current fact, the invention first decomposes the influence on the current fact of the historical facts occurring at time τ_i into two parts:

wherein η_{s,r}(τ_i) and η_{o,r}(τ_i) respectively represent the influence on the current dynamic fact of the historical facts of the head entity s and the tail entity o at time τ_i. For each entity, different historical facts affect the current fact differently, because they connect the entity to different entities through different relationships. For this reason, the invention regards all historical facts of entity e under the timestamp τ_i as a hierarchy and quantifies their influence on the current fact as follows:

where e represents the entity (s or o) for which the influence of historical facts is considered, x represents a target entity in the historical fact (when e is s, x is o),

represents the set of relationships that entity e has at time τ_i,

represents the set of entities that have a relationship with entity e under the timestamp τ_i, and

represents a parameter matrix for measuring the similarity between the relationship vectors. In order to model the different importance of different historical facts to the current fact, the invention uses a hierarchical attention mechanism to calculate the relationship-level attention

and the entity-level attention β_{h,x} separately. The relationship-level attention is calculated as follows:

wherein

and Z_r represent the embedded representations of the relationships in the historical facts. The entity-level attention is calculated as follows:

wherein

and

are the embedded representations, under the corresponding timestamp, of the target entities in the historical facts.

In this embodiment, the occurrence intensity of the known fact (s, r, o, τ) is calculated from the spontaneous occurrence intensity of the fact

and the two parts of the influence of the historical facts on the current fact

as follows:

Since the above expression may yield negative values, whereas an occurrence probability must be a positive number not greater than 1, the invention converts the above occurrence intensity into a positive number by an exponential function:

Therefore, the occurrence probability p(s, r, o | I(τ)) of each known fact can be obtained:

wherein

indicating the occurrence intensity of the candidate fact (e, r, o, τ),

In this embodiment, the probability of each known fact occurring is maximized by minimizing a loss function:

S3, inputting the time sequence knowledge graph snapshots of the time sequence knowledge graph under each timestamp in time sequence, calculating the corresponding soft modularity for the graph structure of each time sequence knowledge graph snapshot, and maximizing the soft modularity to obtain the evolution loss of the global structure, wherein the implementation method comprises the following steps:

S302, according to the connection strength

calculating the soft modularity corresponding to the graph structure of each time sequence knowledge graph snapshot;

S303, calculating the community allocation vector of each entity

In this embodiment, when modeling the community structure of the time sequence knowledge graph, the invention considers that entities connected through different relationships may have different connection strengths, so the connection strength between two entities is first calculated according to the following formula:

wherein r represents any relationship in the set

of relationships,

which is the set of relationships existing between entity i and entity j under the timestamp τ, Z_r represents the vector representation of the relationship r, the accompanying parameter vector measures the connection strength of different relationships, and

represents a non-linear activation function.

Based on this, the soft modularity matrix of the time sequence knowledge graph under each timestamp can be obtained, and each element in the matrix is calculated as follows:

wherein

and

respectively represent the degrees of entity i and entity j under the timestamp τ, and m^τ represents the total number of relationships existing in the time sequence knowledge graph under the timestamp τ.

In order to maximize the soft modularity of the time-series knowledge-graph at each timestamp, the invention needs to obtain the community allocation vector of each entity. Considering that entities in the time-series knowledge graph have multiple types, and the same entity may belong to multiple different communities at the same time, soft community allocation is allowed to be performed on the entities, and the community allocation of each entity is obtained through the following formula:

represents the embedded representation of entity i under the timestamp τ, and

represents the embedded representation corresponding to the community to which entity i belonged under the previous timestamp.

In this embodiment, the soft modularity of the time sequence knowledge graph under each timestamp is finally maximized by minimizing the following loss function:

S4, calculating the overall loss function L of the model according to the evolution loss of the local structure and the evolution loss of the global structure:

L = L_local + L_global

and S6, judging whether the model is converged, if so, obtaining the final entity and relationship embedded representation, and finishing the learning of the time sequence knowledge graph representation, otherwise, returning to the step S1.

Because the model learns the embedded representation of the time sequence knowledge graph by simultaneously modeling its local structure evolution and its global structure evolution, the learned embedded representation can effectively capture the evolutionary nature of the time sequence knowledge graph. The temporal point process based on hierarchical attention can take into account the various evolution modes of entity semantics and compute different influences for different historical events, thereby effectively modeling the establishment of relationships between entities. Based on the soft modularity, the model can effectively model the dynamic community division in the time sequence knowledge graph and learn its evolution process at the macroscopic level. Table 1 compares the experimental results.

TABLE 1

Example 2

The present invention is further described below.

Any dynamic social network in the real world can be represented, by means such as entity disambiguation and relationship extraction, as a time sequence knowledge graph describing the relationships between entities. The obtained time sequence knowledge graph is input into the proposed model, and the embedded representations corresponding to the social entities and relationships are obtained through gradient descent optimization. These embedded representations are then used in a score function describing fact credibility, so as to measure the credibility of each candidate fact, and the facts with the highest credibility are selected to supplement the original dynamic social network. As shown in FIG. 2, the implementation method is as follows:

A1, initializing the parameters of a model and the embedded representation of any social entity and relationship according to the current dynamic social knowledge graph to be represented;

In this embodiment, the expression for initializing the embedded representation of any social entity e under the timestamp τ

is as follows:

wherein θ_e, ω_e and v_e represent vectors specific to the current social entity.

A2, inputting the known social facts of the current dynamic social knowledge graph in the order of their timestamps in the social knowledge graph, calculating the occurrence probability of each known social fact, and obtaining the evolution loss of the local structure by maximizing the occurrence probability of the known social facts, wherein the implementation method comprises the following steps:

A201, inputting the known social facts of the current dynamic social knowledge graph in the order of their timestamps τ in the social knowledge graph, and calculating, according to the participants of any known social fact (s, r, o, τ), the intensity with which the social fact occurs spontaneously

wherein the participants are the social entities s and o and the social relationship r contained in the known social fact;

A202, dividing the excitation effect, on the current dynamic social fact, of the historical social facts occurring at time τ_i

into two parts:

A203, according to the spontaneous occurrence intensity of the social fact

and the two parts of the excitation effect on the current dynamic social fact

calculating the occurrence intensity of the known social fact (s, r, o, τ)

A204, according to the occurrence intensity of the known social fact (s, r, o, τ)

calculating the occurrence probability p(s, r, o | I(τ)) of each known social fact;

A205, according to the occurrence probability of each known social fact, obtaining the evolution loss L_local of the local structure by maximizing the occurrence probability of the facts.

In this embodiment, in order to consider the influence of historical facts on the occurrence probability of the current fact, the invention first decomposes the influence on the current fact of the historical facts occurring at time τ_i into two parts:

wherein η_{s,r}(τ_i) and η_{o,r}(τ_i) respectively represent the influence on the current dynamic social fact of the historical social facts of the head social entity s and the tail social entity o at time τ_i. For each entity, different historical facts affect the current fact differently, because they connect the entity to different entities through different relationships. For this reason, the invention regards all historical facts of entity e under the timestamp τ_i as a hierarchy and quantifies their influence on the current fact as follows:

where e represents the social entity (s or o) for which the influence of historical social facts is considered, x represents a target entity in the historical fact (when e is s, x is o),

represents the set of relationships that entity e has at time τ_i,

represents the set of entities that have a relationship with entity e under the timestamp τ_i, and

represents a parameter matrix for measuring the similarity between the relationship vectors. In order to model the different importance of different historical facts to the current fact, the invention uses a hierarchical attention mechanism to calculate the relationship-level attention

and the social entity-level attention β_{h,x} separately. The relationship-level attention is calculated as follows:

wherein

and Z_r represent the embedded representations of the relationships in the historical facts. The social entity-level attention is calculated as follows:

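wherein

and

are the embedded representations, under the corresponding timestamp, of the target entities in the historical facts.

In this embodiment, the occurrence intensity of the known social fact (s, r, o, τ) is calculated from the spontaneous occurrence intensity of the social fact and the two parts of the influence of the historical facts on the current fact: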

wherein

represents the spontaneous occurrence intensity of the fact, and θ represents a hyper-parameter.

Thus, the occurrence probability p(s, r, o | I(τ)) of each known social fact can be obtained:

wherein

represents the occurrence intensity of the candidate social fact (e, r, o, τ),

represents the occurrence intensity of the candidate social fact (s, r, e, τ), e represents any social entity in the entity set, ε represents the entity set of the social knowledge graph, I(τ) represents the set of historical events before time τ, s represents the head social entity contained in the current fact, r represents the social relationship contained in the current fact, and o represents the tail social entity contained in the current fact.

A3, inputting the social knowledge graph snapshots of the current dynamic social knowledge graph under each timestamp in time sequence, calculating the corresponding soft modularity for the graph structure of each social knowledge graph snapshot, and maximizing the soft modularity to obtain the evolution loss of the global structure, wherein the implementation method comprises the following steps:

A301, inputting the social knowledge graph snapshots of the current dynamic social knowledge graph under each timestamp in time sequence, and calculating the connection strength between two social entities

A302, according to the connection strength

calculating the soft modularity corresponding to the graph structure of each social knowledge graph snapshot;

A303, calculating the community allocation vector of each social entity

A304, according to the community allocation vector of each social entity, maximizing the soft modularity to obtain the evolution loss L_global of the global structure.

wherein $r$ represents any social relationship in the set of social relationships that exist between social entity $i$ and social entity $j$ at the timestamp $\tau$; $Z_r$ represents the vector representation of the social relationship $r$; and the expression further contains a parameter vector for measuring the connection strength of different relationships and a non-linear activation function.
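The following Python sketch shows one way such a connection strength could be evaluated. It assumes that the strength is the sum, over every relationship linking the two entities at a timestamp, of a sigmoid applied to the inner product between the parameter vector and the relationship vector. The sigmoid, the summation and the names `connection_strength`, `Z` and `w` are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(x):
    """Assumed non-linear activation function."""
    return 1.0 / (1.0 + np.exp(-x))

def connection_strength(relation_ids, Z, w):
    """Connection strength between two entities at one timestamp.

    relation_ids -- indices of the relationships that link the two entities
    Z            -- relationship embedding table, shape (num_relations, dim)
    w            -- parameter vector weighting different relationships, shape (dim,)
    """
    if len(relation_ids) == 0:
        return 0.0  # unconnected entities have zero strength
    scores = Z[list(relation_ids)] @ w  # one score per linking relationship
    return float(np.sum(sigmoid(scores)))

# Toy usage with random relationship vectors (illustrative values only).
rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 8))
w = rng.normal(size=8)
strength_ij = connection_strength([0, 3], Z, w)
```

Under these assumptions, a pair of entities linked by several relationships receives a larger strength than a pair linked by one, and each relationship contributes a value between 0 and 1.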

Based on the above, the soft modularity matrix of the time-series knowledge graph under each timestamp can be obtained, and each element of the matrix is given by:

wherein

and

respectively represent the degrees of social entity $i$ and social entity $j$ at the timestamp $\tau$, and $m^{\tau}$ represents the total number of relationships existing in the social knowledge graph at the timestamp $\tau$.
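For illustration, the sketch below builds a modularity matrix in the standard form, in which each element is the connection strength minus the product of the two degrees divided by twice the total number of relationships. Using this standard form, and taking the degree of an entity as the row sum of the connection strength matrix, are assumptions; the function name `soft_modularity_matrix` is hypothetical.

```python
import numpy as np

def soft_modularity_matrix(A, m):
    """Modularity matrix of one snapshot under the standard formulation.

    A -- symmetric connection strength matrix, shape (n, n)
    m -- total number of relationships in the snapshot
    """
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)  # ASSUMED degree: row sum of connection strengths
    return A - np.outer(k, k) / (2.0 * m)

# Two entities joined by a single relationship of strength 1.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
B = soft_modularity_matrix(A, m=1)
```

Each element compares the observed connection strength with the strength expected if connections were placed at random while preserving the degrees, so positive elements mark pairs that are more strongly connected than expected.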

To maximize the soft modularity of the time-series knowledge graph at each timestamp, the invention needs to obtain a community allocation vector for each social entity. Because the entities in the time-series knowledge graph are of multiple types, and the same social entity may belong to several different communities at the same time, soft community allocation is allowed for each social entity, and the community allocation of each entity is obtained through the following formula:

wherein $F$ represents a parameter matrix for mapping the embedded representation of a social entity to the community allocation vector of that social entity,

represents the embedded representation of social entity $i$ at the timestamp $\tau$, and

represents the embedded representation corresponding to the community to which social entity $i$ belonged at the previous timestamp.
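A minimal sketch of soft community allocation and of a modularity-based loss is given below. It assumes that the entity embedding and the previous community embedding are simply added before being projected by the parameter matrix, that a softmax produces the allocation vector, and that the loss is the negative trace of the product of the transposed allocation matrix, the soft modularity matrix and the allocation matrix, divided by twice the number of relationships. The way the two embeddings are combined, the omission of any regularization term, and all function and variable names are assumptions.

```python
import numpy as np

def soft_assignments(X, C_prev, F):
    """Row-wise soft community allocation vectors.

    X      -- entity embeddings at the current timestamp, shape (n, d)
    C_prev -- embedding of each entity's community at the previous timestamp, shape (n, d)
    F      -- parameter matrix mapping embeddings to communities, shape (c, d)
    """
    logits = (X + C_prev) @ F.T                  # ASSUMED combination: simple sum
    logits -= logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    expv = np.exp(logits)
    return expv / expv.sum(axis=1, keepdims=True)

def global_loss(H, B, m):
    """Negative soft modularity; minimizing it maximizes the modularity."""
    return -np.trace(H.T @ B @ H) / (2.0 * m)

# Toy usage: 4 entities, 3-dimensional embeddings, 2 communities.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
C_prev = rng.normal(size=(4, 3))
F = rng.normal(size=(2, 3))
H = soft_assignments(X, C_prev, F)  # each row sums to 1
```

Because every row of the allocation matrix is a probability distribution over communities, an entity can belong partly to several communities, which matches the soft allocation described above.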

A4, calculating the overall loss function $L$ of the model according to the evolution loss of the local structure and the evolution loss of the global structure:

$L = L_{local} + L_{global}$

A5, iteratively optimizing the overall loss function of the model by using a gradient descent method, and updating the parameters of the model and the embedded representations of the social entities and relationships;

A6, judging whether the model has converged; if so, obtaining the final embedded representations of the social entities and relationships and completing the representation learning of the time-series knowledge graph; otherwise, returning to step A1.
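The control flow of steps A4 to A6 can be sketched as a plain gradient descent loop that stops when the overall loss no longer changes. The sketch below is illustrative only: `train`, `loss_and_grad`, the learning rate and the convergence tolerance are hypothetical, and the toy quadratic terms merely stand in for the local and global evolution losses.

```python
import numpy as np

def train(params, loss_and_grad, lr=0.1, tol=1e-8, max_iters=10_000):
    """Minimize a loss by gradient descent until it stops changing.

    loss_and_grad -- callable returning (loss, gradient) for given parameters
    tol           -- ASSUMED convergence test: change in loss below this value
    """
    params = np.asarray(params, dtype=float).copy()
    prev = np.inf
    for _ in range(max_iters):
        loss, grad = loss_and_grad(params)
        if abs(prev - loss) < tol:  # model is considered converged
            break
        params -= lr * grad         # gradient descent update
        prev = loss
    return params, loss

def toy_loss_and_grad(p):
    """Toy stand-in for L = L_local + L_global with an analytic gradient."""
    l_local = np.sum((p - 1.0) ** 2)   # placeholder for the local loss
    l_global = np.sum((p + 1.0) ** 2)  # placeholder for the global loss
    grad = 2.0 * (p - 1.0) + 2.0 * (p + 1.0)
    return l_local + l_global, grad

final_params, final_loss = train([3.0, -2.0], toy_loss_and_grad)
```

In a real model the gradient would normally come from an automatic differentiation framework rather than from a hand-written expression, and the parameters would include the embedded representations of the entities and relationships.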

Claims

1. A time sequence knowledge graph representation learning method based on co-evolution modeling, characterized by comprising the following steps:

S1, initializing parameters of a model and the embedded representation of any social entity and relationship according to a current dynamic social timing knowledge graph to be represented;

S2, inputting the known social facts of the current dynamic social timing knowledge graph in the order of the timestamps corresponding to the facts in the social timing knowledge graph, calculating the occurrence probability of each known social fact, and obtaining the evolution loss of the local structure by maximizing the occurrence probability of the known social facts;

S3, inputting the social timing knowledge graph snapshots of the current dynamic social timing knowledge graph under each timestamp in time sequence, calculating the corresponding soft modularity for the graph structure of each social timing knowledge graph snapshot, and maximizing the soft modularity to obtain the evolution loss of the global structure;

the step S3 includes the steps of:

S301, inputting, in time sequence, the social timing knowledge graph snapshot of the current dynamic social timing knowledge graph under each timestamp, and calculating the connection strength between two social entities;

S302, calculating, according to the connection strength, the soft modularity corresponding to the graph structure of each social timing knowledge graph snapshot, wherein each element of the soft modularity is expressed as follows:

wherein

and

respectively represent the degrees of social entity $i$ and social entity $j$ at the timestamp $\tau$, and $m^{\tau}$ represents the total number of relationships existing in the social timing knowledge graph at the timestamp $\tau$;

S303, calculating the community allocation vector of each social entity;

S304, maximizing the soft modularity according to the community allocation vector of each social entity to obtain the evolution loss $L_{global}$ of the global structure;

the connection strength between the two entities in the step S301 is expressed as follows:

wherein $r$ represents any one of the social relationships in the set

and

represents a non-linear activation function;

the community allocation vector of each entity in the step S303 is expressed as follows:

representing an embedded representation corresponding to the community to which the social entity i belongs in the last timestamp;

the evolution loss $L_{global}$ of the global structure in the step S304 is expressed as follows:

wherein $T$ represents the transposition symbol, $m^{\tau}$ represents the total number of relationships existing in the time-sequence knowledge graph at the timestamp $\tau$, $\mathrm{tr}(\cdot)$ represents the trace of a matrix, $H^{\tau}$ represents the community allocation matrix at the timestamp $\tau$,

represents the soft modularity matrix, and $\mathrm{norm}(\cdot)$ represents two-norm regularization;

S5, iteratively optimizing the overall loss function of the model by using a gradient descent method, and updating the parameters of the model and the embedded representations of the social entities and social relationships;

S6, judging whether the model has converged; if so, obtaining the final embedded representations of the social entities and social relationships and completing the learning of the time sequence knowledge graph representation; otherwise, returning to the step S1.

2. The method for learning sequential knowledge graph representation based on co-evolution modeling according to claim 1, wherein the embedded representation of any social entity $e$ under the timestamp $\tau$ initialized in the step S1 is expressed as follows:

wherein $\theta_e$, $\omega_e$ and $v_e$ each represent a vector specific to the current social entity.

3. The method for learning sequential knowledge graph representation based on co-evolution modeling according to claim 1, wherein the step S2 comprises the following steps:

S201, inputting the known social facts of the current dynamic social timing knowledge graph in the order of the timestamps $\tau$ corresponding to the facts in the social timing knowledge graph, and calculating the spontaneous occurrence intensity of the social fact according to the participants of any known social fact $(s, r, o, \tau)$;

S202, dividing the excitation exerted on the current dynamic social fact by the historical social facts occurring at the time $\tau_i$ into two parts:

wherein $\eta_{s,r}(\tau_i)$ and $\eta_{o,r}(\tau_i)$ respectively represent the influence, on the current dynamic social fact, of the historical social facts at the time $\tau_i$ that involve the head social entity $s$ and the tail social entity $o$ of the current dynamic social fact,

represents the set of relationships that social entity $e$ has at the time $\tau_i$,

represents the relationship-level attention,

and $Z_r$ represent the embedded representations of relationships in the historical social facts,

represents the relationship contained in the historical event, $V$ represents a parameter matrix for measuring the similarity between relationship vectors, $h$ represents a social entity having a relationship with social entity $e$ at the time $\tau_i$, $\beta_{h,x}$ represents the entity-level attention,

represents the vector representation of $h$ at the time $\tau_i$,

represents the vector representation of $x$ at the time $\tau_i$, $x$ represents the entity with which social entity $e$ has a relationship in the current dynamic social fact, $r'$ represents any one of the social relationships of social entity $e$ at the time $\tau_i$, $h'$ represents a specific instance of $h$,

represents the set of entities having a social relationship with social entity $e$ at the time $\tau_i$, and

represents the vector representation of $h'$ at the time $\tau_i$;

S203, calculating the occurrence intensity of the known social fact $(s, r, o, \tau)$ according to the spontaneous occurrence intensity of the social fact and the excitation on the current dynamic social fact;

S204, calculating the occurrence probability of each known social fact according to the occurrence intensity of the known social fact $(s, r, o, \tau)$;

S205, calculating, according to the occurrence probability of each known social fact, the evolution loss $L_{local}$ of the local structure by maximizing the occurrence probability of the facts:

4. The method for learning sequential knowledge graph representation based on co-evolution modeling according to claim 3, wherein the spontaneous occurrence intensity of the fact in the step S201 is expressed as follows:

wherein

and

respectively represent the embedded representations of the head social entity $s$ and the tail social entity $o$ of a social fact under the timestamp, $Z_r$ represents the embedded representation corresponding to the social relationship $r$, and $w$ represents a learnable parameter matrix for measuring the similarity between the vectors.

5. The method for learning sequential knowledge graph representation based on co-evolution modeling according to claim 3, wherein the occurrence intensity of the known fact $(s, r, o, \tau)$ in the step S203 is expressed as follows:

wherein

represents the excitation effect of the historical facts on the current fact, $\tau$ represents the occurrence time of the current fact, $\tau_i$ represents the occurrence time of the historical event, and $k(\tau - \tau_i)$ represents the time decay function.

6. The method for learning representation of time-series knowledge graph based on co-evolution modeling according to claim 3, wherein the probability $p(s, r, o \mid I(\tau))$ of occurrence of each known fact in the step S204 is expressed as follows:

wherein

represents the occurrence intensity of the candidate social fact $(s, r, e, \tau)$, $e$ represents any social entity in the set of entities, $\varepsilon$ represents the set of entities of the social timing knowledge graph, $I(\tau)$ represents the set of historical events before $\tau$, $s$ represents the head social entity contained in the current fact, $r$ represents the social relationship contained in the current fact, and $o$ represents the tail social entity contained in the current fact.