CN112860918B - Sequential knowledge graph representation learning method based on collaborative evolution modeling - Google Patents
- Publication number
- CN112860918B (application CN202110305818.4A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a sequential knowledge graph representation learning method based on collaborative evolution modeling, belonging to the technical field of time sequence knowledge graphs. The method comprises: initializing the parameters of a model and the embedded representations of all entities and relationships according to the time sequence knowledge graph to be represented; calculating the occurrence probability of each known fact, and obtaining the evolution loss of the local structure by maximizing the occurrence probability of the known facts; calculating the soft modularity of the graph structure of each time sequence knowledge graph snapshot, and obtaining the evolution loss of the global structure by maximizing the soft modularity; calculating the overall loss function of the model; and iteratively optimizing the overall loss function by gradient descent until the model converges. The invention addresses the problem that previous work could not obtain accurate embedded representations because it ignored the evolutionary nature of time sequence knowledge graphs.
Description
Technical Field
The invention belongs to the technical field of time sequence knowledge graphs, and particularly relates to a time sequence knowledge graph representation learning method based on collaborative evolution modeling.
Background
A knowledge graph is a knowledge base system with semantic attributes, widely used to store and manage structured data in various fields, such as dynamic social interaction. A knowledge graph can be represented as a heterogeneous directed graph in which nodes represent real-world entities and concepts and labeled directed edges represent the relationships between them. Although many knowledge graph representation learning methods have been proposed, they rarely consider the dynamics of knowledge graphs and, in particular, ignore their evolutionary nature. The updating of knowledge is reflected in a knowledge graph as the appearance and disappearance of entities or the establishment and removal of relationships, so knowledge graphs are time-varying and evolving. Because existing work ignores this temporal nature, the embedded representations it learns are inaccurate and unreasonable.
In recent years, some work has attempted to learn embedded representations for such time-varying knowledge graphs, a task also known as time sequence knowledge graph representation learning. This work mainly comprises four types of methods: (1) approaches based on temporal relationship dependencies, which incorporate time information by constraining the order of occurrence between relationships; (2) temporal hyperplane-based approaches, which learn the embedded representation at each time separately by mapping the knowledge at different times onto different hyperplanes; (3) duration-based entity embedding approaches, which treat the embedded representation of an entity as a time-dependent non-linear function; and (4) tensor decomposition-based approaches, which learn the embedded representation of a time sequence knowledge graph using a low-rank decomposition of adjacency matrices.
However, the above works either learn the embedded representation for each time instant independently, ignoring the evolutionary nature of the time sequence knowledge graph, or reduce that evolution to the non-linear dynamics of individual entities, which cannot reflect the detailed evolution mechanism of the graph. From the perspective of the local structure, relationships between entities are continuously established and released as time progresses, and this drives the evolution of the time sequence knowledge graph. From the perspective of the global structure, the establishment and release of a large number of relationships jointly produce a slow evolution of the community structure in the graph. Local structure evolution and global structure evolution are not independent: local structure evolution is the internal mechanism of global structure evolution, and global structure evolution is an external driving factor of local structure evolution. Considering the collaborative evolution of the local and global structures therefore allows a more accurate time sequence knowledge graph embedding to be learned, a point that the prior art does not consider.
Disclosure of Invention
In view of the above defects in the prior art, the invention provides a sequential knowledge graph representation learning method based on collaborative evolution modeling. Its innovations are that the evolution of time sequence knowledge is modeled from the perspectives of the local structure and the global structure simultaneously, and that a new soft modularity is proposed for measuring the community structure.
In order to achieve the above purpose, the invention adopts the technical scheme that:
the scheme provides a sequential knowledge graph representation learning method based on collaborative evolution modeling, which comprises the following steps:
S1, initializing the parameters of a model and the embedded representations of all entities and relationships according to the time sequence knowledge graph to be represented;
S2, inputting the known facts of the time sequence knowledge graph in the order of their timestamps, calculating the occurrence probability of each known fact, and obtaining the evolution loss of the local structure by maximizing the occurrence probability of the known facts;
S3, inputting the snapshots of the time sequence knowledge graph under each timestamp in time order, calculating the soft modularity of the graph structure of each snapshot, and obtaining the evolution loss of the global structure by maximizing the soft modularity;
S4, calculating the overall loss function of the model from the evolution loss of the local structure and the evolution loss of the global structure;
S5, iteratively optimizing the overall loss function of the model by gradient descent, and updating the parameters of the model and the embedded representations of the entities and relationships;
and S6, judging whether the model has converged; if so, obtaining the final embedded representations of the entities and relationships and finishing the learning of the time sequence knowledge graph representation; otherwise, returning to the step S1.
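The loop formed by steps S4 to S6 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names (`numeric_grad`, `train`), the finite-difference gradient, the fixed learning rate and the convergence test on the change in loss are all assumptions introduced here, and the sketch simply continues from the current parameters when the model has not yet converged.

```python
def numeric_grad(f, params, eps=1e-6):
    """Central-difference gradient of f at params (hypothetical helper)."""
    grads = []
    for k in range(len(params)):
        up = params[:k] + [params[k] + eps] + params[k + 1:]
        down = params[:k] + [params[k] - eps] + params[k + 1:]
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


def train(params, local_loss, global_loss, lr=0.1, tol=1e-8, max_iter=10_000):
    """Gradient descent on L = L_local + L_global until the loss stops changing."""
    def total(p):                       # S4: overall loss
        return local_loss(p) + global_loss(p)

    prev = cur = total(params)
    for _ in range(max_iter):
        grads = numeric_grad(total, params)
        params = [p - lr * g for p, g in zip(params, grads)]   # S5: update
        cur = total(params)
        if abs(prev - cur) < tol:       # S6: assumed convergence criterion
            break
        prev = cur
    return params, cur


# Toy usage with two stand-in quadratic losses.
fitted, final_loss = train(
    [0.0, 0.0],
    local_loss=lambda p: (p[0] - 1.0) ** 2,
    global_loss=lambda p: (p[1] + 2.0) ** 2,
)
```

In practice the two losses would be the local and global evolution losses described in steps S2 and S3, and an automatic-differentiation framework would replace the finite-difference gradient.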
The beneficial effects of the invention are as follows: the invention provides a novel sequential knowledge graph representation learning method based on co-evolution, which models the evolution of knowledge from the two aspects of local evolution and global evolution and captures the internal mechanism of knowledge evolution, thereby learning more accurate representation vectors and improving the performance of downstream tasks such as event prediction. Compared with prior methods, the method has higher operating efficiency and can adapt to online environments with streaming data.
Further, the step S1 initializes the embedded representation $u_e^{\tau}$ of any entity $e$ under the timestamp $\tau$, whose expression is as follows:
wherein $\theta_e$, $\omega_e$ and $v_e$ all represent vectors specific to the current entity.
The beneficial effects of the above further scheme are: the different evolution modes of different entities can be fully considered, such as periodic evolution, non-periodic evolution trends, and static attributes.
Still further, the step S2 includes the steps of:
S201, inputting the known facts of the current time sequence knowledge graph in the order of their timestamps $\tau$, and calculating the spontaneous occurrence intensity of any known fact $(s, r, o, \tau)$ from its participants, wherein the participants are the entities $s$, $o$ and the relation $r$ contained in the known fact;
S202, decomposing the excitation effect that the historical facts occurring at time $\tau_i$ exert on the current dynamic fact into two parts:
wherein $\eta_{s,r}(\tau_i)$ and $\eta_{o,r}(\tau_i)$ respectively represent the effect of the historical facts of the head entity $s$ and the tail entity $o$ at time $\tau_i$ on the current dynamic fact; the expression further involves the set of relationships that entity $e$ has at time $\tau_i$, the relationship-level attention, and the embedded representations of the relationships in the historical facts, including $Z_r$; $V$ represents a parameter matrix for measuring the similarity between relationship vectors; $h$ represents an entity that has a relationship with entity $e$ at time $\tau_i$; $\beta_{h,x}$ represents the entity-level attention; $u_h^{\tau_i}$ and $u_x^{\tau_i}$ represent the vector representations of $h$ and $x$ at time $\tau_i$; $x$ represents the entity related to entity $e$ in the current dynamic fact; $r'$ represents any one of the relationships that entity $e$ has at time $\tau_i$; $h'$ represents a specific one of the entities $h$, taken from the set of entities that have a relationship with entity $e$ at time $\tau_i$; and $u_{h'}^{\tau_i}$ represents the vector representation of $h'$ at time $\tau_i$;
S203, calculating the occurrence intensity of the known fact $(s, r, o, \tau)$ from the spontaneous occurrence intensity of the fact and the two parts of the excitation effect on the current dynamic fact;
S204, calculating the occurrence probability $p(s, r, o \mid I(\tau))$ of each known fact from the occurrence intensity of the known fact $(s, r, o, \tau)$;
S205, according to the occurrence probability of each known fact, obtaining the evolution loss $L_{local}$ of the local structure by maximizing the occurrence probability of the facts:
where $I(\tau)$ represents the set of historical events before the time $\tau$.
The beneficial effects of the further scheme are as follows: adaptive importance weighting is applied to different historical events to flexibly account for different effects of different historical facts on the current fact.
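The two-level weighting described in step S202 can be illustrated with a short sketch. The attention formulas themselves are not reproduced in this text, so the sketch assumes a common form: a softmax over bilinear relation similarities (using a matrix playing the role of $V$) for the relationship level, and a softmax over dot products of entity vectors for the entity level. The function names and the data layout of `history` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bilinear(a, V, b):
    """a^T V b for list-based vectors and a list-of-rows matrix."""
    return sum(a[i] * V[i][j] * b[j] for i in range(len(a)) for j in range(len(b)))


def hierarchical_weights(z_r, u_x, history, V):
    """Weight of each historical fact of one entity at one timestamp.

    history: list of (z_r_prime, [u_h, ...]) groups, one group per historical
    relation r' (assumed layout). Returns one list of weights per group.
    """
    alpha = softmax([bilinear(z, V, z_r) for z, _ in history])   # relation level
    weights = []
    for a, (_, neighbours) in zip(alpha, history):
        beta = softmax([dot(u_h, u_x) for u_h in neighbours])    # entity level
        weights.append([a * b for b in beta])
    return weights


identity = [[1.0, 0.0], [0.0, 1.0]]
history = [
    ([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]),
    ([0.0, 1.0], [[0.5, 0.5]]),
]
weights = hierarchical_weights([1.0, 0.0], [1.0, 0.0], history, identity)
```

Because both levels are normalized, the combined weights over all historical facts sum to one, so facts whose relation and neighbouring entity resemble the current fact receive a larger share of the influence.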
Further, the expression of the spontaneous occurrence intensity of the fact in the step S201 is as follows:
wherein $u_s^{\tau}$ and $u_o^{\tau}$ respectively represent the embedded representations of the head entity $s$ and the tail entity $o$ of a fact under the timestamp $\tau$, $Z_r$ represents the embedded representation corresponding to the relation $r$, and $W$ represents a learnable parameter matrix for measuring the similarity between the vectors.
The beneficial effects of the further scheme are as follows: the spontaneous fact at any moment can be effectively identified.
Still further, the expression of the occurrence intensity of the known fact $(s, r, o, \tau)$ in the step S203 is as follows:
wherein the first term represents the spontaneous occurrence intensity of the fact, $\theta$ represents a hyper-parameter, the excitation term represents the excitation effect of a historical fact on the current fact, $\tau$ represents the occurrence time of the current fact, $\tau_i$ represents the occurrence time of the historical fact, and $\kappa(\tau - \tau_i)$ represents a time decay function.
The beneficial effects of the further scheme are as follows: the spontaneous intensity of a fact and the influence of historical facts on it are considered at the same time, so the occurrence intensity of the fact can be fully modeled.
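A minimal sketch of how these quantities might combine is given below. It follows the description (a spontaneous term plus a hyper-parameter times time-decayed excitation from historical facts, mapped to a positive value by an exponential function, as the detailed description states), but the additive form, the exponential decay kernel and all names are assumptions, not the patent's exact expression.

```python
import math


def exp_decay(dt, delta=1.0):
    """Assumed time decay function kappa(dt) = exp(-delta * dt)."""
    return math.exp(-delta * dt)


def occurrence_intensity(base, history, tau, theta=0.5, kernel=exp_decay):
    """Positive occurrence intensity of a fact at time tau.

    base:    spontaneous occurrence intensity (may be negative)
    history: list of (tau_i, excitation_i) pairs for historical facts
    theta:   hyper-parameter weighting the historical excitation
    """
    excited = sum(g * kernel(tau - t_i) for t_i, g in history if t_i < tau)
    raw = base + theta * excited
    return math.exp(raw)   # exponential keeps the intensity positive


no_history = occurrence_intensity(0.0, [], tau=1.0)
with_history = occurrence_intensity(0.0, [(0.0, 2.0)], tau=1.0)
```

With this kernel, a historical fact contributes less the further it lies in the past, and facts at or after the current time are ignored.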
Still further, the expression of the occurrence probability $p(s, r, o \mid I(\tau))$ of each known fact in the step S204 is as follows:
wherein the two intensity terms respectively represent the occurrence intensity of the candidate fact $(e, r, o, \tau)$ and that of the candidate fact $(s, r, e, \tau)$; $e$ represents any entity in the entity set; $\varepsilon$ represents the entity set of the time sequence knowledge graph; $I(\tau)$ represents the set of historical events before the time $\tau$; $s$ represents the head entity contained in the current fact; $r$ represents the relation contained in the current fact; and $o$ represents the tail entity contained in the current fact.
The beneficial effects of the above further scheme are: the probability of occurrence of valid facts is substantially maximized.
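The normalization over candidate facts can be sketched as follows. The sketch assumes that the probability is the product of two ratios, one obtained by replacing the head entity with every candidate $e$ and one by replacing the tail entity, and that the local loss is the negative log-likelihood of the known facts; the exact combination is not reproduced in this text, and the callable `intensity` is a hypothetical stand-in for the occurrence intensity.

```python
import math


def fact_probability(intensity, s, r, o, tau, entities):
    """Probability of (s, r, o, tau), normalized over head and tail candidates."""
    lam = intensity(s, r, o, tau)
    head_norm = sum(intensity(e, r, o, tau) for e in entities)   # (e, r, o, tau)
    tail_norm = sum(intensity(s, r, e, tau) for e in entities)   # (s, r, e, tau)
    return (lam / head_norm) * (lam / tail_norm)


def local_loss(intensity, facts, entities):
    """Negative log-likelihood of the known facts (assumed form of L_local)."""
    return -sum(
        math.log(fact_probability(intensity, s, r, o, tau, entities))
        for s, r, o, tau in facts
    )


entities = ["a", "b", "c", "d"]
uniform = lambda s, r, o, tau: 1.0
p = fact_probability(uniform, "a", "knows", "b", 1.0, entities)
loss = local_loss(uniform, [("a", "knows", "b", 1.0)], entities)
```

Minimizing this loss raises the intensity of each observed fact relative to all facts that differ from it in the head or the tail entity.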
Still further, the step S3 includes the steps of:
S301, inputting the snapshots of the time sequence knowledge graph under each timestamp in time order, and calculating the connection strength between each pair of entities;
S302, according to the connection strength, calculating the soft modularity corresponding to the graph structure of each time sequence knowledge graph snapshot, wherein the expression of each element of the soft modularity is as follows:
wherein the two degree terms respectively represent the degree of entity $i$ and entity $j$ at the timestamp $\tau$, and $m_{\tau}$ represents the total number of relationships existing in the time sequence knowledge graph at the timestamp $\tau$;
S303, calculating the community allocation vector of each entity;
S304, according to the community allocation vector of each entity, maximizing the soft modularity to obtain the evolution loss $L_{global}$ of the global structure.
The beneficial effects of the further scheme are as follows: the dynamics and the heterogeneity of the time sequence knowledge graph can be fully considered.
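The soft modularity elements of step S302 can be illustrated with the standard modularity matrix, in which each element is the connection strength minus the product of the two degrees divided by twice the edge total. This is an assumption about the missing expression: the sketch uses weighted degrees (row sums of the connection strengths) and takes twice the edge total to be the sum of all degrees, and the function name is hypothetical.

```python
def soft_modularity_matrix(strengths):
    """Modularity-style matrix B with B[i][j] = a_ij - d_i * d_j / (2m).

    strengths: symmetric list-of-rows matrix of connection strengths a_ij.
    Degrees are row sums, and 2m is taken as the sum of all degrees.
    """
    n = len(strengths)
    degrees = [sum(row) for row in strengths]
    two_m = sum(degrees)
    if two_m == 0:
        raise ValueError("snapshot has no connections")
    return [
        [strengths[i][j] - degrees[i] * degrees[j] / two_m for j in range(n)]
        for i in range(n)
    ]


# Snapshot with two entities joined by one connection of strength 1.
B = soft_modularity_matrix([[0.0, 1.0], [1.0, 0.0]])
```

With these choices every row of the matrix sums to zero, which is the usual property of a modularity matrix: it measures how much stronger a connection is than expected from the degrees alone.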
Still further, the expression of the connection strength between the two entities in the step S301 is as follows:
wherein $r$ represents any relation in the set of relations existing between entity $i$ and entity $j$ under the timestamp $\tau$; $Z_r$ represents the vector of the relation $r$; and the expression further contains a parameter vector for measuring the connection strength of different relations and a non-linear activation function.
The beneficial effects of the above further scheme are: different connection strengths between entities brought by different relationships can be flexibly considered.
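One plausible reading of this definition is a sum, over the relations linking two entities, of an activated projection of each relation vector onto the parameter vector. The sketch below assumes that form with a sigmoid activation; the parameter name `alpha`, the choice of activation and the function names are assumptions introduced for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def connection_strength(relation_vectors, alpha, activation=sigmoid):
    """Connection strength between two entities at one timestamp.

    relation_vectors: vectors Z_r of the relations linking the two entities
    alpha:            parameter vector scoring how strong each relation is
    """
    return sum(
        activation(sum(a * z for a, z in zip(alpha, z_r)))
        for z_r in relation_vectors
    )


alpha = [1.0, -1.0]
none = connection_strength([], alpha)
one = connection_strength([[0.0, 0.0]], alpha)
two = connection_strength([[0.0, 0.0], [0.0, 0.0]], alpha)
```

Under this reading, a pair of entities linked by several relations, or by relations that score highly against the parameter vector, is treated as more strongly connected than a pair linked by a single weak relation.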
Still further, the expression of the community allocation vector of each entity in the step S303 is as follows:
wherein $F$ represents a parameter matrix for mapping the embedded representation of an entity to the community allocation vector of the entity; the first input term represents the embedded representation of entity $i$ under the timestamp $\tau$; and the second input term represents the embedded representation corresponding to the community to which entity $i$ belonged at the previous timestamp.
The beneficial effects of the further scheme are as follows: the community division of the entity can be calculated based on the topological structure of the time sequence knowledge graph under the current timestamp and the slow evolution characteristic of the community.
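A soft community allocation of this kind can be sketched as a softmax over a linear map of the entity's current embedding together with the embedding of its previous community. The concatenation of the two inputs and the softmax are assumptions, since the expression itself is not reproduced here; only the roles of $F$ and the two inputs come from the text.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]


def community_assignment(F, u_i, c_prev):
    """Soft community allocation vector of one entity.

    F:      list of rows, one row per community (hypothetical layout)
    u_i:    embedded representation of the entity at the current timestamp
    c_prev: embedded representation of its community at the previous timestamp
    """
    x = list(u_i) + list(c_prev)     # assumed: concatenate the two inputs
    logits = [sum(f * v for f, v in zip(row, x)) for row in F]
    return softmax(logits)


F = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
]
h_i = community_assignment(F, u_i=[2.0, 0.0], c_prev=[1.0, 0.0])
```

Feeding the previous community embedding into the allocation is one simple way to reflect the slow evolution of communities: an entity tends to stay in a community similar to the one it belonged to before.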
Still further, the expression of the evolution loss $L_{global}$ of the global structure in the step S304 is as follows:
wherein $T$ denotes the transpose, $m_{\tau}$ represents the total number of relationships existing in the time sequence knowledge graph at the timestamp $\tau$, $\mathrm{tr}(\cdot)$ represents the trace of a matrix, $H_{\tau}$ represents the community allocation matrix at the timestamp $\tau$, the remaining matrix term represents the soft modularity matrix, and $\mathrm{norm}(\cdot)$ represents two-norm regularization.
The beneficial effects of the further scheme are as follows: the method can simplify the maximization process of the soft modularity and accelerate the convergence of the model.
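The trace form of the global loss can be sketched for a single snapshot as follows. The sketch assumes the loss is the negative of the trace of the transposed allocation matrix times the modularity matrix times the allocation matrix, divided by twice the edge total, plus a weighted two-norm penalty on the allocation matrix; what exactly is regularized, and with what weight, is not stated in this text, so the `reg` term is hypothetical.

```python
import math


def global_loss(H, B, m, reg=0.0):
    """Negative soft modularity of one snapshot plus an assumed norm penalty.

    H: community allocation matrix, one row per entity
    B: soft modularity matrix (list of rows)
    m: total number of relationships in the snapshot
    """
    n = len(H)
    # tr(H^T B H) equals the sum over i, j of B[i][j] * <H[i], H[j]>.
    trace = sum(
        B[i][j] * sum(a * b for a, b in zip(H[i], H[j]))
        for i in range(n)
        for j in range(n)
    )
    penalty = math.sqrt(sum(v * v for row in H for v in row))
    return -trace / (2.0 * m) + reg * penalty


B = [[-0.5, 0.5], [0.5, -0.5]]           # two entities joined by one edge
together = global_loss([[1.0, 0.0], [1.0, 0.0]], B, m=1)
apart = global_loss([[1.0, 0.0], [0.0, 1.0]], B, m=1)
```

For two connected entities, placing both in the same community yields the lower loss, which is the behaviour expected when maximizing modularity.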
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a flowchart of a method applied to a dynamic social network.
Detailed Description
The following description of the embodiments is provided to help those skilled in the art understand the present invention. It should be understood that the invention is not limited to the scope of these embodiments. It will be apparent to those skilled in the art that various changes may be made without departing from the spirit and scope of the invention as defined by the appended claims, and all creations that make use of the inventive concept fall within the scope of protection.
Example 1
As shown in fig. 1, the invention provides a sequential knowledge graph representation learning method based on collaborative evolution modeling, which is implemented as follows:
S1, initializing the parameters of a model and the embedded representations of all entities and relationships according to the time sequence knowledge graph to be represented;
In this embodiment, the embedded representation of any entity $e$ under the timestamp $\tau$ is initialized with the following expression:
wherein $\theta_e$, $\omega_e$ and $v_e$ represent vectors specific to the current entity.
S2, inputting the known facts of the time sequence knowledge graph in the order of their timestamps, calculating the occurrence probability of each known fact, and obtaining the evolution loss of the local structure by maximizing the occurrence probability of the known facts, wherein the implementation method comprises the following steps:
S201, inputting the known facts of the current time sequence knowledge graph in the order of their timestamps $\tau$, and calculating the spontaneous occurrence intensity of any known fact $(s, r, o, \tau)$ from its participants, wherein the participants are the entities $s$, $o$ and the relation $r$ contained in the known fact;
S202, decomposing the excitation effect that the historical facts occurring at time $\tau_i$ exert on the current dynamic fact into two parts;
S203, calculating the occurrence intensity of the known fact $(s, r, o, \tau)$ from the spontaneous occurrence intensity of the fact and the two parts of the excitation effect on the current dynamic fact;
S204, calculating the occurrence probability $p(s, r, o \mid I(\tau))$ of each known fact from the occurrence intensity of the known fact $(s, r, o, \tau)$;
S205, according to the occurrence probability of each known fact, obtaining the evolution loss $L_{local}$ of the local structure by maximizing the occurrence probability of the facts.
In this embodiment, in order to consider the influence of historical facts on the occurrence probability of the current fact, the invention first decomposes the influence of the historical facts occurring at time $\tau_i$ on the current fact into two parts:
wherein $\eta_{s,r}(\tau_i)$ and $\eta_{o,r}(\tau_i)$ respectively represent the effect of the historical facts of the head entity $s$ and the tail entity $o$ at time $\tau_i$ on the current dynamic fact. For each entity, its historical facts connect it to different entities through different relationships, so different historical facts affect the current fact differently. For this reason, the invention treats all historical facts of entity $e$ under the timestamp $\tau_i$ as a hierarchy and quantifies their impact on the current fact as follows:
where $e$ represents the entity ($s$ or $o$) whose historical facts are being considered; $x$ represents the target entity (when $e$ is $s$, $x$ is $o$); the expression further involves the set of relationships that entity $e$ has at time $\tau_i$ and the set of entities that have a given relationship with entity $e$ under the timestamp $\tau_i$; and $V$ represents a parameter matrix for measuring the similarity between relationship vectors. In order to model the different importance of different historical facts to the current fact, the invention uses a hierarchical attention mechanism to calculate the relationship-level attention and the entity-level attention $\beta_{h,x}$ separately. The relationship-level attention is calculated as follows:
wherein the two relationship terms, one of which is $Z_r$, are the embedded representations of relationships in the historical facts. The entity-level attention is calculated as follows:
wherein the two entity terms are the embedded representations, under the corresponding timestamp, of the entities involved in the historical fact.
In this embodiment, the occurrence intensity of the known fact $(s, r, o, \tau)$ is calculated from the spontaneous occurrence intensity of the fact and the two-part influence of the historical facts on the current fact:
Since the above expression may yield negative values, whereas an occurrence probability must be positive and at most 1, the invention converts the above occurrence intensity into a positive number by an exponential function:
wherein the first term represents the spontaneous occurrence intensity of the fact, $\theta$ represents a hyper-parameter, the excitation term represents the excitation effect of a historical fact on the current fact, $\tau$ represents the occurrence time of the current fact, $\tau_i$ represents the occurrence time of the historical fact, and $\kappa(\tau - \tau_i)$ represents a time decay function.
The occurrence probability $p(s, r, o \mid I(\tau))$ of each known fact can therefore be obtained:
wherein the two intensity terms respectively represent the occurrence intensity of the candidate fact $(e, r, o, \tau)$ and that of the candidate fact $(s, r, e, \tau)$; $e$ represents any entity in the entity set; $\varepsilon$ represents the entity set of the time sequence knowledge graph; $I(\tau)$ represents the set of historical events before the time $\tau$; $s$ represents the head entity contained in the current fact; $r$ represents the relation contained in the current fact; and $o$ represents the tail entity contained in the current fact.
In this embodiment, the occurrence probability of each known fact is maximized by minimizing the following loss function:
S3, inputting the snapshots of the time sequence knowledge graph under each timestamp in time order, calculating the soft modularity of the graph structure of each snapshot, and obtaining the evolution loss of the global structure by maximizing the soft modularity, wherein the implementation method comprises the following steps:
S301, inputting the snapshots of the time sequence knowledge graph under each timestamp in time order, and calculating the connection strength between each pair of entities;
S302, according to the connection strength, calculating the soft modularity corresponding to the graph structure of each time sequence knowledge graph snapshot;
S303, calculating the community allocation vector of each entity;
S304, according to the community allocation vector of each entity, maximizing the soft modularity to obtain the evolution loss $L_{global}$ of the global structure.
In this embodiment, when modeling the community structure of the time sequence knowledge graph, entities connected through different relationships may be connected with different strengths, so the connection strength between two entities is first calculated according to the following formula:
wherein $r$ represents any relation in the set of relations existing between entity $i$ and entity $j$ under the timestamp $\tau$; $Z_r$ represents the vector of the relation $r$; and the expression further contains a parameter vector for measuring the connection strength of different relations and a non-linear activation function.
Based on this, the soft modularity matrix of the time sequence knowledge graph under each timestamp can be obtained, each element of which is given by:
wherein the two degree terms respectively represent the degree of entity $i$ and entity $j$ at the timestamp $\tau$, and $m_{\tau}$ represents the total number of relationships existing in the time sequence knowledge graph at the timestamp $\tau$.
In order to maximize the soft modularity of the time sequence knowledge graph at each timestamp, the community allocation vector of each entity must be obtained. Because entities in the time sequence knowledge graph are of multiple types, and the same entity may belong to several different communities at the same time, soft community allocation is allowed, and the community allocation of each entity is obtained through the following formula:
wherein $F$ represents a parameter matrix for mapping the embedded representation of an entity to the community allocation vector of the entity; the first input term represents the embedded representation of entity $i$ under the timestamp $\tau$; and the second input term represents the embedded representation corresponding to the community to which entity $i$ belonged at the previous timestamp.
In this embodiment, the soft modularity of the time sequence knowledge graph under each timestamp is finally maximized by minimizing the following loss function:
S4, calculating the overall loss function $L$ of the model from the evolution loss of the local structure and the evolution loss of the global structure:
$L = L_{local} + L_{global}$
S5, iteratively optimizing the overall loss function of the model by gradient descent, and updating the parameters of the model and the embedded representations of the entities and relationships;
and S6, judging whether the model has converged; if so, obtaining the final embedded representations of the entities and relationships and finishing the learning of the time sequence knowledge graph representation; otherwise, returning to the step S1.
The model learns the embedded representation of the time sequence knowledge graph by modeling its local structure evolution and global structure evolution simultaneously, so the learned representation can effectively capture the evolutionary nature of the graph. The temporal point process based on hierarchical attention accounts for the various evolution modes of entity semantics and assigns different influences to different historical events, thereby effectively modeling the establishment of relationships between entities. Based on the soft modularity, the model can effectively model the dynamic community division in the time sequence knowledge graph and learn its evolution at the macroscopic level. Table 1 compares the experimental results.
TABLE 1
Example 2
The present invention is further described below.
Any dynamic social network in the real world can be represented, by means such as entity disambiguation and relationship extraction, as a time sequence knowledge graph describing the relationships between entities. The resulting time sequence knowledge graph is input into the proposed model, and the embedded representations of the social entities and relationships are obtained through gradient descent optimization. These embedded representations are then used in a score function that describes the credibility of a fact, so that the credibility of each candidate fact can be measured and the fact with the highest credibility selected to supplement the original dynamic social network. As shown in fig. 2, the implementation method is as follows:
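The final selection step described above can be sketched as a simple ranking of candidate facts by a score function. The function name `complete_network`, the tuple layout of a fact and the toy scores are hypothetical; in the method described here, the score would be derived from the learned embedded representations.

```python
def complete_network(score, known_facts, candidates, top_k=1):
    """Return the top_k most credible candidate facts that are not yet known.

    score:       callable mapping a fact (s, r, o, tau) to a credibility value
    known_facts: set of facts already present in the dynamic social network
    candidates:  iterable of candidate facts to rank
    """
    unseen = [c for c in candidates if c not in known_facts]
    ranked = sorted(unseen, key=score, reverse=True)
    return ranked[:top_k]


# Toy credibility scores standing in for the learned score function.
scores = {
    ("ann", "follows", "bob", 3): 0.2,
    ("ann", "follows", "cat", 3): 0.9,
    ("ann", "follows", "dan", 3): 0.6,
}
known = {("ann", "follows", "dan", 3)}
selected = complete_network(scores.get, known, scores.keys(), top_k=1)
```

The selected facts would then be added to the original dynamic social network as its most credible missing links.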
A1, initializing the parameters of a model and the embedded representations of all social entities and relationships according to the current dynamic social knowledge graph to be represented;
In this embodiment, the embedded representation of any social entity $e$ under the timestamp $\tau$ is initialized with the following expression:
wherein $\theta_e$, $\omega_e$ and $v_e$ represent vectors specific to the current social entity.
A2, inputting the known social facts of the current dynamic social knowledge graph in the order of their timestamps, calculating the occurrence probability of each known social fact, and obtaining the evolution loss of the local structure by maximizing the occurrence probability of the known social facts, wherein the implementation method comprises the following steps:
A201, inputting the known social facts of the current dynamic social knowledge graph in the order of their timestamps $\tau$, and calculating the spontaneous occurrence intensity of any known social fact $(s, r, o, \tau)$ from its participants, wherein the participants are the social entities $s$, $o$ and the social relation $r$ contained in the known social fact;
A202, decomposing the excitation effect that the historical social facts occurring at time $\tau_i$ exert on the current dynamic social fact into two parts;
A203, calculating the occurrence intensity of the known social fact $(s, r, o, \tau)$ from the spontaneous occurrence intensity of the social fact and the two parts of the excitation effect on the current dynamic social fact;
A204, calculating the occurrence probability $p(s, r, o \mid I(\tau))$ of each known social fact from the occurrence intensity of the known social fact $(s, r, o, \tau)$;
A205, according to the occurrence probability of each known social fact, obtaining the evolution loss $L_{local}$ of the local structure by maximizing the occurrence probability of the facts.
In this embodiment, in order to account for the influence of historical facts on the occurrence probability of the current fact, the invention first decomposes the influence of the historical facts that occurred at time τ_i on the current fact into two parts:
where η_{s,r}(τ_i) and η_{o,r}(τ_i) respectively denote the influence, on the current dynamic social fact, of the historical social facts at time τ_i that involve the head social entity s and the tail social entity o of the current fact. For each entity, different historical facts influence the current fact differently, because they connect the entity to different entities through different relations. For this reason, the invention treats all historical facts of entity e under timestamp τ_i as a hierarchy and quantifies their influence on the current fact as follows:
where e denotes the social entity (s or o) whose historical social facts are being considered; x denotes the target entity (when e is s, x is o); the relation set comprises the relations that entity e has at time τ_i; the entity set comprises the entities connected to entity e by a relation under timestamp τ_i; and V denotes a parameter matrix for measuring the similarity between relation vectors. In order to model the different importance of different historical facts to the current fact, the invention uses a hierarchical attention mechanism to calculate the relation-level attention and the social-entity-level attention β_{h,x} separately. The relation-level attention is calculated as follows:
where Z_r and its companion relation vector denote embedded representations of relations in the historical facts. The social-entity-level attention is calculated as follows:
where the two entity vectors are the embedded representations, under the corresponding timestamp, of the entities in the historical fact.
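The two attention levels described above can be sketched as follows. The scoring functions are assumptions: the sketch scores relations with a bilinear form through V followed by a softmax, and scores entities with a dot product followed by a softmax. The function names (`relation_attention`, `entity_attention`) are hypothetical, and the exact expressions of the patent are not reproduced in the text.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def bilinear(u, V, w):
    """Compute u^T V w for plain Python lists."""
    return sum(u[i] * V[i][j] * w[j] for i in range(len(u)) for j in range(len(w)))

def relation_attention(z_r, history_relation_vecs, V):
    """Relation-level attention (ASSUMED: softmax of bilinear similarity to z_r)."""
    return softmax([bilinear(z_h, V, z_r) for z_h in history_relation_vecs])

def entity_attention(x_vec, neighbour_vecs):
    """Entity-level attention (ASSUMED: softmax of dot products with x_vec)."""
    return softmax([sum(a * b for a, b in zip(h, x_vec)) for h in neighbour_vecs])

# Toy data: V is the identity, so the bilinear score is a plain dot product.
V = [[1.0, 0.0], [0.0, 1.0]]
alpha = relation_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], V)
beta = entity_attention([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy case the historical relation that matches the current relation receives the larger relation-level weight, while the two neighbouring entities are equally similar to x and share the entity-level weight equally.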
In this embodiment, the occurrence intensity of the known fact (s, r, o, τ) is calculated from the spontaneous occurrence intensity of the fact and the two-part excitation exerted on the current fact:
Since the above expression may yield negative values, whereas an occurrence probability must be a positive number no greater than 1, the invention converts the occurrence intensity into a positive number by means of an exponential function:
where the first term denotes the original (spontaneous) occurrence intensity of the fact, θ denotes a hyper-parameter, the excitation term denotes the excitation effect of the historical facts on the current fact, τ denotes the occurrence time of the current fact, τ_i denotes the occurrence time of the historical event, and κ(τ − τ_i) denotes a time decay function.
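The structure described above (a spontaneous term, plus historical excitation weighted by a time decay function, passed through an exponential to guarantee positivity) can be sketched as follows. The exponential decay kernel and the `decay_rate` parameter are assumptions, and the placement of the hyper-parameter θ is not recoverable from the text, so the sketch leaves it out.

```python
import math

def time_decay(delta_t, decay_rate=1.0):
    """kappa(tau - tau_i). ASSUMPTION: exponential kernel with a hypothetical rate."""
    return math.exp(-decay_rate * delta_t)

def occurrence_intensity(base, history, tau, decay_rate=1.0):
    """Positive occurrence intensity of a fact at time tau.

    base    : spontaneous occurrence intensity (may be negative)
    history : list of (tau_i, excitation_i) pairs for historical facts
    Only historical facts with tau_i < tau contribute.
    """
    raw = base + sum(
        excitation * time_decay(tau - tau_i, decay_rate)
        for tau_i, excitation in history
        if tau_i < tau
    )
    return math.exp(raw)  # the exponential maps any real value to a positive one

no_history = occurrence_intensity(0.0, [], tau=1.0)
with_history = occurrence_intensity(0.0, [(0.0, 1.0)], tau=1.0)
```

With this kernel, a historical fact contributes less the further it lies in the past, and a negative spontaneous intensity still produces a positive result.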
Thus, the occurrence probability p(s, r, o | I(τ)) of each known social fact can be obtained:
where the first intensity term denotes the occurrence intensity of the candidate social fact (e, r, o, τ), the second intensity term denotes the occurrence intensity of the candidate social fact (s, r, e, τ), e denotes any social entity in the entity set, ε denotes the entity set of the social knowledge graph, I(τ) denotes the set of historical events before time τ, s denotes the head social entity of the current fact, r denotes the social relation of the current fact, and o denotes the tail social entity of the current fact.
In this embodiment, the occurrence probability of each known fact is maximized by minimizing the following loss function:
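A minimal sketch of this normalization and of a negative log-likelihood loss follows. It assumes that the intensity of the known fact is divided by the summed intensities of all candidates obtained by replacing the head or the tail entity, and that the loss is the negative log of that probability summed over known facts. Both forms, and the names `fact_probability` and `local_loss`, are assumptions rather than the patent's expressions.

```python
import math

def fact_probability(intensity, entities, s, r, o):
    """p(s, r, o | history) under the ASSUMED normalization.

    intensity : callable (head, relation, tail) -> positive float
    entities  : iterable of all entities in the knowledge graph
    """
    entities = list(entities)
    head_candidates = sum(intensity(e, r, o) for e in entities)  # facts (e, r, o)
    tail_candidates = sum(intensity(s, r, e) for e in entities)  # facts (s, r, e)
    return intensity(s, r, o) / (head_candidates + tail_candidates)

def local_loss(intensity, entities, facts):
    """Negative log-likelihood of the known facts (ASSUMED loss form)."""
    return -sum(math.log(fact_probability(intensity, entities, s, r, o))
                for s, r, o in facts)

# Toy intensity that treats every candidate fact as equally likely.
uniform_intensity = lambda head, relation, tail: 1.0
toy_entities = ["a", "b", "c"]
p = fact_probability(uniform_intensity, toy_entities, "a", "knows", "b")
loss = local_loss(uniform_intensity, toy_entities, [("a", "knows", "b")])
```

Minimizing such a loss raises the intensity of observed facts relative to the corrupted candidates, which is the sense in which the occurrence probability of known facts is maximized.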
A3, inputting, in time sequence, the social knowledge graph snapshots of the current dynamic social knowledge graph under each timestamp, calculating the corresponding soft modularity for the graph structure of each social knowledge graph snapshot, and maximizing the soft modularity to obtain the evolution loss of the global structure, wherein the implementation method comprises the following steps:
A301, inputting, in time sequence, the social knowledge graph snapshot of the current dynamic social knowledge graph under each timestamp, and calculating the connection strength between two social entities;
A302, calculating, according to the connection strength, the soft modularity corresponding to the graph structure of each social knowledge graph snapshot;
A304, maximizing the soft modularity according to the community allocation vector of each social entity to obtain the evolution loss L_global of the global structure.
In this embodiment, when modeling the community structure of the time-series knowledge graph, entities connected through different relations may have different connection strengths, so the connection strength between two entities is first calculated according to the following formula:
where r denotes any social relation in the set of social relations that exist between social entity i and social entity j at timestamp τ, Z_r denotes the vector representation of the social relation r, the parameter vector measures the connection strength of different relations, and the remaining symbol denotes a non-linear activation function.
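The relation-dependent connection strength can be sketched as follows. The sketch assumes that each relation between the two entities contributes the activation of a dot product between a parameter vector and the relation vector, and that the contributions are summed; the sigmoid activation and the names `q` and `connection_strength` are assumptions for illustration.

```python
import math

def sigmoid(x):
    """Non-linear activation (ASSUMED choice)."""
    return 1.0 / (1.0 + math.exp(-x))

def connection_strength(relations_between, relation_vecs, q):
    """Connection strength between two entities at one timestamp.

    relations_between : relations that link the two entities in this snapshot
    relation_vecs     : dict mapping relation name -> embedding Z_r
    q                 : parameter vector scoring each relation (hypothetical name)
    """
    return sum(
        sigmoid(sum(a * b for a, b in zip(q, relation_vecs[r])))
        for r in relations_between
    )

toy_relation_vecs = {"friend": [0.0, 0.0], "colleague": [1.0, 1.0]}
toy_q = [1.0, -1.0]
one_link = connection_strength(["friend"], toy_relation_vecs, toy_q)
two_links = connection_strength(["friend", "colleague"], toy_relation_vecs, toy_q)
```

Entities linked by several relations accumulate a larger strength, and unlinked entities get a strength of zero.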
Based on the above, the soft modularity matrix of the time-series knowledge graph under each timestamp can be obtained, each element of which is given by:
where the two degree terms respectively denote the degrees of social entity i and social entity j at timestamp τ, and m_τ denotes the total number of relations that exist in the social knowledge graph at timestamp τ.
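For illustration, the sketch below builds a modularity matrix in the standard form B_ij = A_ij − d_i·d_j / (2m), using the connection strengths as A. Whether the patent uses exactly this form cannot be confirmed from the text, and taking the degree as the row sum of the strength matrix is an assumption.

```python
def soft_modularity_matrix(strength, m):
    """Modularity matrix of one snapshot (ASSUMED standard form).

    strength : symmetric n x n list of connection strengths A_ij
    m        : total number of relations in the snapshot
    """
    n = len(strength)
    degree = [sum(row) for row in strength]  # assumption: degree = row sum of A
    return [
        [strength[i][j] - degree[i] * degree[j] / (2.0 * m) for j in range(n)]
        for i in range(n)
    ]

# Two entities joined by a single relation of strength 1.
B = soft_modularity_matrix([[0.0, 1.0], [1.0, 0.0]], m=1)
```

Each entry compares the observed connection strength with the strength expected if links were placed at random given the degrees, so positive entries mark pairs that are more tightly connected than chance.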
In order to maximize the soft modularity of the time-series knowledge graph at each timestamp, the invention needs to obtain a community allocation vector for each social entity. Since the entities in the time-series knowledge graph are of multiple types, and the same social entity may belong to several different communities at the same time, soft community allocation is allowed, and the community allocation of each entity is obtained through the following formula:
where F denotes a parameter matrix for mapping the embedded representation of a social entity to its community allocation vector, the first vector denotes the embedded representation of social entity i under timestamp τ, and the second vector denotes the embedded representation corresponding to the community to which social entity i belonged at the previous timestamp.
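A soft community allocation can be sketched as a softmax over the projection of the entity embedding by F. This is an assumption for illustration: the patent's formula also involves the community embedding from the previous timestamp, and because the way the two are combined is not given in the text, the sketch uses only the current embedding. The name `community_allocation` is hypothetical.

```python
import math

def community_allocation(F, entity_vec):
    """Soft community memberships of one entity (ASSUMED: softmax of F @ e).

    F          : k x d list of lists, one row per community
    entity_vec : d-dimensional embedding of the entity at the current timestamp
    """
    logits = [sum(f * x for f, x in zip(row, entity_vec)) for row in F]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Three communities, two-dimensional embeddings.
uniform = community_allocation([[0.0, 0.0]] * 3, [1.0, 2.0])
skewed = community_allocation([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], [3.0, 0.0])
```

The output sums to one, so an entity can hold partial membership in several communities at once, which matches the soft allocation described above.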
In this embodiment, the soft modularity of the time-series knowledge graph under each timestamp is finally maximized by minimizing the following loss function:
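A minimal sketch of a modularity-based loss for a single snapshot follows. It assumes the loss is the negated, normalized trace −tr(Hᵀ B H) / (2m), built from the symbols the claims define (transpose, trace, community allocation matrix H, soft modularity matrix, m). The two-norm regularization term mentioned in the claims is left out because its argument is not given in the text, and the function name is hypothetical.

```python
def modularity_loss(H, B, m):
    """Negated soft modularity of one snapshot: -tr(H^T B H) / (2m) (ASSUMED form).

    H : n x k community allocation matrix (one row per entity)
    B : n x n soft modularity matrix
    m : total number of relations in the snapshot
    """
    n, k = len(H), len(H[0])
    trace = sum(
        H[i][c] * B[i][j] * H[j][c]
        for c in range(k) for i in range(n) for j in range(n)
    )
    return -trace / (2.0 * m)

# Two linked entities: B is the modularity matrix of a single edge with m = 1.
B_edge = [[-0.5, 0.5], [0.5, -0.5]]
loss_same = modularity_loss([[1.0, 0.0], [1.0, 0.0]], B_edge, m=1)   # same community
loss_split = modularity_loss([[1.0, 0.0], [0.0, 1.0]], B_edge, m=1)  # different communities
```

In this toy case, placing the two linked entities in the same community gives the lower loss, which is the behaviour a modularity-maximizing objective should have.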
A4, calculating the overall loss function L of the model according to the evolution loss of the local structure and the evolution loss of the global structure:
L = L_local + L_global
A5, iteratively optimizing the overall loss function of the model by using a gradient descent method, and updating the parameters of the model and the embedded representations of the social entities and relations;
A6, judging whether the model has converged; if so, obtaining the final embedded representations of the social entities and relations and completing the learning of the time-series knowledge graph representation; otherwise, returning to step A1.
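Steps A4 to A6 can be sketched as a plain gradient descent loop over the sum of the two losses. The toy quadratic losses, the numerical gradient, the learning rate and the convergence tolerance are all assumptions for illustration; a real implementation would use the model's own losses and an automatic differentiation framework.

```python
def numerical_grad(f, params, eps=1e-6):
    """Central-difference gradient of f at params (a list of floats)."""
    grads = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + eps] + params[i + 1:]
        down = params[:i] + [params[i] - eps] + params[i + 1:]
        grads.append((f(up) - f(down)) / (2.0 * eps))
    return grads

def train(local_loss, global_loss, params, lr=0.1, tol=1e-8, max_iters=1000):
    """Minimize L = L_local + L_global until the loss stops changing."""
    total = lambda p: local_loss(p) + global_loss(p)   # step A4
    current = previous = total(params)
    for _ in range(max_iters):
        grads = numerical_grad(total, params)           # step A5
        params = [p - lr * g for p, g in zip(params, grads)]
        current = total(params)
        if abs(previous - current) < tol:               # step A6: converged
            break
        previous = current
    return params, current

# Hypothetical stand-ins for the local and global evolution losses.
toy_local = lambda p: (p[0] - 1.0) ** 2
toy_global = lambda p: (p[1] + 2.0) ** 2
final_params, final_loss = train(toy_local, toy_global, [0.0, 0.0])
```

The sketch carries the updated parameters into the next iteration when the convergence check fails, and stops once the change in the overall loss falls below the tolerance.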
Claims (6)
1. A time-series knowledge graph representation learning method based on co-evolution modeling, characterized by comprising the following steps:
S1, initializing the parameters of the model and the embedded representation of any social entity and relation according to the current dynamic social time-series knowledge graph to be represented;
S2, inputting the known social facts of the current dynamic social time-series knowledge graph in the order of their timestamps in the social time-series knowledge graph, calculating the occurrence probability of each known social fact, and obtaining the evolution loss of the local structure by maximizing the occurrence probability of the known social facts;
S3, inputting, in time sequence, the social time-series knowledge graph snapshots of the current dynamic social time-series knowledge graph under each timestamp, calculating the corresponding soft modularity for the graph structure of each social time-series knowledge graph snapshot, and maximizing the soft modularity to obtain the evolution loss of the global structure;
the step S3 includes the steps of:
S301, inputting, in time sequence, the social time-series knowledge graph snapshot of the current dynamic social time-series knowledge graph under each timestamp, and calculating the connection strength between two social entities;
S302, calculating, according to the connection strength, the soft modularity corresponding to the graph structure of each social time-series knowledge graph snapshot, wherein each element of the soft modularity is expressed as follows:
wherein the two degree terms respectively denote the degrees of social entity i and social entity j at timestamp τ, and m_τ denotes the total number of relations that exist in the social time-series knowledge graph at timestamp τ;
S304, maximizing the soft modularity according to the community allocation vector of each social entity to obtain the evolution loss L_global of the global structure;
the connection strength between the two social entities in step S301 is expressed as follows:
wherein r denotes any social relation in the set of social relations that exist between social entity i and social entity j at timestamp τ, Z_r denotes the vector representation of the social relation r, the parameter vector measures the connection strength of different relations, and the remaining symbol denotes a non-linear activation function;
wherein F denotes a parameter matrix for mapping the embedded representation of a social entity to its community allocation vector, the first vector denotes the embedded representation of social entity i under timestamp τ, and the second vector denotes the embedded representation corresponding to the community to which social entity i belonged at the previous timestamp;
the evolution loss L_global of the global structure in step S304 is expressed as follows:
wherein T denotes the transposition symbol, m_τ denotes the total number of relations existing in the time-series knowledge graph at timestamp τ, tr(·) denotes the trace of a matrix, H_τ denotes the community allocation matrix at timestamp τ, the remaining matrix denotes the soft modularity matrix, and norm(·) denotes two-norm regularization;
S4, calculating the overall loss function of the model according to the evolution loss of the local structure and the evolution loss of the global structure;
S5, iteratively optimizing the overall loss function of the model by using a gradient descent method, and updating the parameters of the model and the embedded representations of the social entities and social relations;
S6, judging whether the model has converged; if so, obtaining the final embedded representations of the social entities and social relations and completing the learning of the time-series knowledge graph representation; otherwise, returning to step S1.
2. The time-series knowledge graph representation learning method based on co-evolution modeling according to claim 1, wherein in step S1 the embedded representation of any social entity e under the timestamp τ is initialized according to the following expression:
wherein θ_e, ω_e and v_e all denote vectors specific to the current social entity.
3. The time-series knowledge graph representation learning method based on co-evolution modeling according to claim 1, wherein step S2 comprises the following steps:
S201, inputting the known social facts of the current dynamic social time-series knowledge graph in the order of their timestamps τ in the social time-series knowledge graph, and calculating the spontaneous occurrence intensity of any known social fact (s, r, o, τ) according to its participants, wherein the participants are the social entities s and o and the social relation r contained in the known social fact;
S202, dividing the excitation exerted on the current dynamic social fact by the historical social facts that occurred at time τ_i into two parts:
wherein η_{s,r}(τ_i) and η_{o,r}(τ_i) respectively denote the influence, on the current dynamic social fact, of the historical social facts at time τ_i that involve the head social entity s and the tail social entity o of the current dynamic social fact; the relation set is the set of relations that social entity e has at time τ_i; the relation-level attention weights the relations in that set; Z_r and its companion relation vector denote embedded representations of relations in the historical social facts, the latter being the relation contained in the historical event; V denotes a parameter matrix for measuring the similarity between relation vectors; h denotes a social entity having a relation with social entity e at time τ_i; β_{h,x} denotes the entity-level attention; the vector representations of h and x are taken at time τ_i; x denotes the entity with which social entity e has a relation in the current dynamic social fact; r' denotes any one of the social relations of social entity e at time τ_i; h' denotes any one entity in the set of entities having a social relation with social entity e at time τ_i; and the vector representation of h' is likewise taken at time τ_i;
S203, calculating the occurrence intensity of the known social fact (s, r, o, τ) according to the spontaneous occurrence intensity of the social fact and the two-part excitation exerted on the current dynamic social fact;
S204, calculating the occurrence probability p(s, r, o | I(τ)) of each known social fact according to the occurrence intensity of the known social fact (s, r, o, τ);
S205, calculating the evolution loss L_local of the local structure by maximizing the occurrence probability of each known social fact:
wherein I(τ) denotes the set of historical events before time τ.
4. The time-series knowledge graph representation learning method based on co-evolution modeling according to claim 3, wherein the spontaneous occurrence intensity of the fact in step S201 is expressed as follows:
wherein the two entity vectors respectively denote the embedded representations, under the timestamp, of the head social entity s and the tail social entity o of the social fact, Z_r denotes the embedded representation of the social relation r, and w denotes a learnable parameter matrix for measuring the similarity between the vectors.
5. The time-series knowledge graph representation learning method based on co-evolution modeling according to claim 3, wherein the occurrence intensity of the known fact (s, r, o, τ) in step S203 is expressed as follows:
wherein the first term denotes the original (spontaneous) occurrence intensity of the fact, θ denotes a hyper-parameter, the excitation term denotes the excitation effect of the historical facts on the current fact, τ denotes the occurrence time of the current fact, τ_i denotes the occurrence time of the historical event, and κ(τ − τ_i) denotes a time decay function.
6. The time-series knowledge graph representation learning method based on co-evolution modeling according to claim 3, wherein the occurrence probability p(s, r, o | I(τ)) of each known fact in step S204 is expressed as follows:
wherein the first intensity term denotes the occurrence intensity of the candidate social fact (e, r, o, τ), the second intensity term denotes the occurrence intensity of the candidate social fact (s, r, e, τ), e denotes any social entity in the entity set, ε denotes the entity set of the social time-series knowledge graph, I(τ) denotes the set of historical events before time τ, s denotes the head social entity of the current fact, r denotes the social relation of the current fact, and o denotes the tail social entity of the current fact.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110305818.4A CN112860918B (en) | 2021-03-23 | 2021-03-23 | Sequential knowledge graph representation learning method based on collaborative evolution modeling |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112860918A CN112860918A (en) | 2021-05-28 |
CN112860918B true CN112860918B (en) | 2023-03-14 |
Family
ID=75992217
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110305818.4A Active CN112860918B (en) | 2021-03-23 | 2021-03-23 | Sequential knowledge graph representation learning method based on collaborative evolution modeling |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112860918B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |