CN113128689A - Entity relationship path reasoning method and system for regulating knowledge graph - Google Patents
- Publication number
- CN113128689A (application number CN202110462388.7A)
- Authority
- CN
- China
- Prior art keywords
- entity
- knowledge graph
- regulation
- relationship
- relation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
- G06N5/045—Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The entity relationship path reasoning method and system for a regulation knowledge graph realize relationship path reasoning between any two entities in the regulation knowledge graph and can further be applied to regulation-assisted decision making. The method comprises: representing entity pairs in the regulation knowledge graph as vectors; and reasoning over the vectorized entity pairs with a pre-trained DDPG (Deep Deterministic Policy Gradient) model to obtain the optimal relationship path for each entity pair. A prior knowledge base is constructed to screen higher-quality training samples from the regulation knowledge graph, which improves the quality of the samples in the experience replay unit of the DDPG model and thus the training effect of the neural network. The result is a path inference method that fuses prior knowledge with the deep deterministic policy gradient.
Description
Technical Field
The invention relates to the field of power dispatching, and in particular to an entity relationship path reasoning method and system for a regulation knowledge graph.
Background
With the continuous expansion of the power grid and its increasingly complex operating characteristics, both the complexity of grid regulation services and the workload borne by regulation personnel grow by the day. As information systems are upgraded, the volume and dimensionality of acquired data keep increasing and intelligent applications multiply, so that dispatching-automation business logic and the related domain knowledge can no longer be mastered comprehensively by manpower alone. Once a complex problem occurs in the system that simple business logic or business operations cannot resolve, the possible fault cause can currently be found only by adding human effort.
The fault knowledge reasoning task provides auxiliary logic or decision judgments and can be divided into two main approaches: relation reasoning based on vectorized representations, and path ranking based on random walks. The former is represented by TransE, TransH and similar models: entities and relations in the knowledge graph are iteratively trained in triple form and converted into vector representations of a given dimension, and the relations between entities are then inferred by computing distances between vectors. This vector-embedding approach turns the "entity-relation-entity" pattern into vector addition and performs well when the knowledge graph is of moderate to large scale, but when "one-to-many" and a large number of repeated relations appear in the graph, the accuracy of inference by vector computation degrades sharply. The Path Ranking Algorithm (PRA) proposed by Ni Lao et al., a representative random-walk model, queries the paths between any two entities in the knowledge graph, ranks them by path length, treats the paths existing between two entities as features, and uses those features to judge whether a given relation holds between them. This method has good interpretability, but it acts in a discrete feature space, making it difficult to evaluate similarity between entities and relations, and its search efficiency drops sharply as the knowledge graph grows.
The premise of knowledge-reasoning decisions is finding the relational connections between entities, and scholars at home and abroad have done partial research. For example, the DeepPath model proposed by Xiong et al. treats inference of relations between entities as relation path inference, solves it with a deep reinforcement learning method based on the Actor-Critic framework, and uses the path inference result for relation prediction. In the article on knowledge graph reasoning technology based on hybrid enhanced intelligence, Yankee et al. further propose path reasoning with a hybrid enhanced-intelligence method, fusing manual judgment into the training process to improve training convergence. Unlike conventional knowledge graphs, a regulation knowledge graph contains a large amount of repeated relational data, i.e., one starting entity plus one relation may correspond to n terminal entities. As the number of entity-relation-entity pairs increases, the complexity of the connection paths grows exponentially, and how to select the optimal path in a selection space dozens of times larger than usual is the problem to be solved.
Disclosure of Invention
Aiming at the problems in the prior art that a regulation knowledge graph contains a large number of repeated relations and a path-selection space dozens of times larger than usual, the invention provides an entity relationship path reasoning method and system for a regulation knowledge graph, which realize relationship path reasoning between any two entities in the regulation knowledge graph and can further be applied to regulation-assisted decision making.
The invention is realized by the following technical scheme:
An entity relationship path inference method for a regulation knowledge graph comprises:
representing entity pairs in the regulation knowledge graph as vectors;
and reasoning over the vectorized entity pairs through a pre-trained DDPG model to obtain the optimal relationship path of each entity pair.
Preferably, the pre-trained DDPG model is trained as follows:
representing the entity pairs in the regulation knowledge graph as vectors;
randomly generating a sample selection probability; if the sample selection probability is not greater than a set expert-experience threshold, constructing a prior knowledge base from the entity pairs of the regulation knowledge graph and using it as the training sample;
if the sample selection probability is greater than the set expert-experience threshold, generating a random sample through the neural network of the DDPG model from the entity pairs in the regulation knowledge graph and using it as the training sample;
collecting the training samples into the experience replay unit of the DDPG model, and training the neural network of the DDPG model by sampling the stored training samples from the experience replay unit.
Further, constructing the prior knowledge base from the entity pairs of the regulation knowledge graph specifically comprises finding, by depth-first search, all relation paths existing between any two entities in the regulation knowledge graph, and storing them for training in the following relation-path triple format:

P_n = {(e_head, r, e_m), ..., (e_m, r, e_end)}

where P_n is the nth relation path; e_head and e_end are the head and tail entities; e_m is an intermediate entity connected to the head or tail entity; and r is a relation in the knowledge graph.
Preferably, when the entity pairs in the regulation knowledge graph are represented as vectors, any triple <entity 1, relation, entity 2> in the knowledge graph is converted into a continuous vector-space representation satisfying entity vector 1 + relation vector = entity vector 2. This comprises the following steps:
step 1, randomly initializing the entity and relation vectors;
step 2, sampling a fixed-size batch of triples and computing the error from the entity and relation vectors in each triple according to:

error = || e_head + r - e_end ||

where vector e_head is the head entity of the triple, vector e_end is the tail entity, and vector r is the relation connecting the head and tail entities;
step 3, updating the vector parameters in the triples by gradient descent;
step 4, looping steps 2 and 3 until the error is minimized; training is then complete and the vectorized representation of the entity pairs is obtained.
Preferably, in the pre-trained DDPG model, the critic network learns according to the following loss function:

y = R + γ·Q'(s_{i+1}, a')
L(θ) = E[(y - Q(s, a))²]

where y is the target Q value; Q'(s_{i+1}, a') is the Q value of the target critic network; R is the reward function; s is the state; a is the relation vector passed from the target action network to the target critic network; γ is the discount factor; L(θ) is the squared loss between the target Q value and the critic network's Q value; θ is the parameter set of the critic network; and E denotes the expectation.
Further, the reward function R is given by:

R = R_complete + R_length

where R_complete is the reward for whether the inference path reaches the target entity, and R_length is a reward inversely proportional to the path length.
Preferably, in the pre-trained DDPG model, the action network updates its parameters based on the deterministic policy according to:

∇_θ J = E_{s~D}[ ∇_a Q^μ(s, a)|_{a=μ(s)} · ∇_θ μ(s) ]

where J is the objective function of the action network; θ is its parameter set; s is a state; D is the state-space corpus; μ denotes the deterministic action output by the action network; Q^μ(s, a) is the Q value under the deterministic policy μ; a is the relation passed from the action network to the target critic network; and ∇ denotes the gradient.
An entity relationship path inference system for a regulation knowledge graph comprises:
an entity vectorization embedding module, used for representing the entity pairs in the regulation knowledge graph as vectors;
and a relationship reasoning module, used for reasoning over the vectorized entity pairs through a pre-trained DDPG model to obtain the optimal relationship path of each entity pair.
Preferably, the relationship reasoning module further comprises a prior-knowledge-base construction module, used for constructing a prior knowledge base from the entity pairs of the regulation knowledge graph; specifically, all relation paths existing between any two entities in the regulation knowledge graph are found by depth-first search and stored for training in the following relation-path triple format:

P_n = {(e_head, r, e_m), ..., (e_m, r, e_end)}

where P_n is the nth relation path; e_head and e_end are the head and tail entities; e_m is an intermediate entity connected to the head or tail entity; and r is a relation in the knowledge graph.
A computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the entity relationship path inference method for a regulation knowledge graph described above.
Compared with the prior art, the invention has the following beneficial technical effects:
The invention realizes relationship path reasoning between any two entities in the regulation knowledge graph by reasoning with a pre-trained DDPG model, and can further be applied to regulation-assisted decision making. A prior knowledge base is constructed to screen higher-quality training samples from the regulation knowledge graph, which improves the quality of the training samples in the experience replay unit of the DDPG model and thus the training effect of the neural network; a path inference method fusing prior knowledge with the deep deterministic policy gradient is thereby realized.
Furthermore, neural-network learning can realize the complex mapping from state space to action output without establishing an explicit mathematical relation between environment state and action, which improves the adaptability of the model. Because the neural network retains what it has learned, path reasoning after training requires no further training in actual use, improving on the efficiency of conventional query-based path reasoning methods.
Furthermore, the neural-network parameters are updated by combining empirical knowledge with random exploration, which greatly improves the exploration efficiency of the neural network model and achieves fast, stable convergence.
Furthermore, the reward function is designed with a dual mechanism of reachability and inverse proportionality to path length, which guarantees the usability of the inferred path while keeping it as short as possible, improving the usability of the model.
Drawings
FIG. 1 is a flow chart of the inference method described in the examples of the present invention.
FIG. 2 is a flow chart of an inference method including a pre-training process as described in the examples of the present invention.
FIG. 3 is an architecture diagram of the DDPG model training in the present example.
FIG. 4 is a block diagram of the system in an example of the invention.
Detailed Description
The present invention will now be described in further detail with reference to specific examples, which are intended to be illustrative, but not limiting, of the invention.
The invention relates to an entity relationship path inference method for a regulation knowledge graph which, as shown in FIG. 1, comprises: representing entity pairs in the regulation knowledge graph as vectors; and reasoning over the vectorized entity pairs through a pre-trained DDPG model to obtain the optimal relationship path of each entity pair.
The above pre-trained DDPG model improves on the existing DDPG model, mainly through an improved training method which, as shown in FIG. 2, comprises the following steps.
S1, representing the entity pairs in the regulation knowledge graph as vectors. The vectorized representation of entities and relations in the knowledge graph is realized mainly through entity vectorization embedding: any triple <entity 1, relation, entity 2> in the knowledge graph is converted into a continuous vector-space representation satisfying entity vector 1 + relation vector = entity vector 2. This comprises the following steps:
step 1, randomly initializing the entity and relation vectors;
step 2, sampling a fixed-size batch of triples and computing the error from the entity and relation vectors in each triple according to:

error = || e_head + r - e_end ||

where vector e_head is the head entity of the triple, vector e_end is the tail entity, and vector r is the relation connecting the head and tail entities;
step 3, updating the vector parameters in the triples by gradient descent;
step 4, looping steps 2 and 3 until the error is minimized; training is then complete and the vectorized representation of the entity pairs is obtained.
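The embedding procedure above (initialize vectors, compute the translation error || e_head + r - e_end ||, update by gradient descent) is a TransE-style scheme. A minimal sketch follows; the embedding dimension, learning rate, and single-triple loop are illustrative assumptions, not values from the patent.

```python
import numpy as np

def transe_step(e_head, r, e_end, lr=0.01):
    """One gradient-descent step on the squared error ||e_head + r - e_end||^2."""
    diff = e_head + r - e_end   # residual of the translation entity1 + relation = entity2
    grad = 2.0 * diff           # gradient w.r.t. e_head and r; negated for e_end
    return e_head - lr * grad, r - lr * grad, e_end + lr * grad

# random initialization of a toy triple in a 4-dimensional embedding space
rng = np.random.default_rng(0)
e1, rel, e2 = rng.normal(size=(3, 4))

# loop error computation and gradient updates until the error is small
for _ in range(500):
    e1, rel, e2 = transe_step(e1, rel, e2)
err = float(np.linalg.norm(e1 + rel - e2))
```

After convergence, e1 + rel closely approximates e2, which is the property the later inference stage relies on.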
The objects processed by entity vectorization embedding in step S1 differ by mode: during training, the regulation-knowledge-graph training set is used; during testing, the regulation-knowledge-graph test set; and during operation, the set of regulation-knowledge-graph entities to be inferred.
S2, relationship reasoning training. The original DDPG model training is improved mainly through prior-knowledge-base construction, improved deep-neural-network learning, and the action-network parameter update, thereby realizing the improved DDPG model, as shown in FIG. 3.
Starting a training cycle, a sample selection probability is randomly generated. If it is not greater than the set expert-experience threshold, a prior knowledge base is constructed from the entity pairs of the regulation knowledge graph and used as the training sample; if it is greater than the threshold, a random sample is generated through the neural network of the DDPG model from the entity pairs in the regulation knowledge graph and used as the training sample. A complete sample comprises the relation vector a_i, the current state s_i, the next state s_{i+1}, and the reward R_i. In the preferred embodiment, the expert-experience threshold defaults to 0.5.
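The branch on the sample selection probability can be sketched as below. The two source names and the 0.5 default come from the description; the function itself is a hypothetical illustration.

```python
import random

EXPERT_THRESHOLD = 0.5  # default expert-experience threshold from the description

def pick_sample_source(p=None):
    """Decide where the next training sample for the experience replay unit comes from."""
    p = random.random() if p is None else p
    # "not greater than the threshold" -> draw from the prior knowledge base
    return "prior_knowledge" if p <= EXPERT_THRESHOLD else "random_exploration"
```

Mixing expert-derived samples with random exploration in this way is what keeps the replay unit stocked with higher-quality trajectories early in training.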
Specifically, constructing the prior knowledge base comprises finding, by depth-first search, all relation paths existing between any two entities in the regulation knowledge graph and storing them for training in the following relation-path triple format:

P_n = {(e_head, r, e_m), ..., (e_m, r, e_end)}

where P_n is the nth relation path; e_head and e_end are the head and tail entities; e_m is an intermediate entity connected to the head or tail entity; and r is a relation in the knowledge graph.
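Depth-first enumeration of all relation paths between two entities, stored as chains of triples, can be sketched as follows. The toy graph and its entity and relation names are invented for illustration; the depth bound is an assumption.

```python
def find_relation_paths(graph, head, end, max_depth=4):
    """Enumerate all relation paths head -> end as lists of (entity, relation, entity) triples."""
    paths = []

    def dfs(node, path, visited):
        if node == end and path:          # reached the tail entity: record P_n
            paths.append(list(path))
            return
        if len(path) >= max_depth:        # bound the search depth
            return
        for rel, nxt in graph.get(node, []):
            if nxt not in visited:        # avoid cycles
                path.append((node, rel, nxt))
                dfs(nxt, path, visited | {nxt})
                path.pop()

    dfs(head, [], {head})
    return paths

# toy regulation-style graph: entity -> [(relation, neighbour entity), ...]
g = {
    "breaker_1": [("connects", "bus_A"), ("monitored_by", "rtu_7")],
    "bus_A": [("feeds", "line_3")],
    "rtu_7": [("reports_to", "line_3")],
}
paths = find_relation_paths(g, "breaker_1", "line_3")
```

Each element of `paths` is one P_n in the triple format above, ready to be stored in the prior knowledge base.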
The training samples are collected into the experience replay unit of the DDPG model, and the neural network of the DDPG model is trained by sampling the stored training samples from the experience replay unit.
When the DDPG model is pre-trained, the critic network learns according to the following loss function:

y = R + γ·Q'(s_{i+1}, a')
L(θ) = E[(y - Q(s, a))²]

where y is the target Q value; Q'(s_{i+1}, a') is the Q value of the target critic network; R is the reward function; a is the relation vector passed from the target action network to the target critic network, i.e., the execution policy; γ is the discount factor; L(θ) is the squared loss between the target Q value and the critic network's Q value; θ is the parameter set of the critic network; and E denotes the expectation. Specifically, y equals the reward R plus the discount factor γ times the target critic's Q value, which is computed from the next-moment state and the action produced by the target action network; L(θ) is used to train and update the critic network's parameters.
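Numerically, the critic update first forms the target y from the reward and the target critic's estimate of the next state, then takes the mean squared TD error as the loss. A small sketch with invented numbers:

```python
import numpy as np

def critic_targets(rewards, next_q, gamma=0.99):
    """y = R + gamma * Q'(s_{i+1}, a'), using the target networks' outputs."""
    return rewards + gamma * next_q

def critic_loss(q_values, targets):
    """L(theta) = E[(y - Q(s, a))^2]: mean squared TD error."""
    return float(np.mean((targets - q_values) ** 2))

rewards = np.array([1.0, -0.5])      # rewards R for two sampled transitions
next_q = np.array([2.0, 1.0])        # target critic's Q'(s_{i+1}, a')
y = critic_targets(rewards, next_q)  # [2.98, 0.49]
loss = critic_loss(np.array([2.5, 0.0]), y)
```

In a full implementation the loss would be minimized by gradient descent on the critic's parameters θ; here only the forward computation is shown.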
In the above loss function, the reward function R is designed around the essential factors of path inference: first, whether the inference path can reach the target entity; second, the length of the inference path. The specific reward function R is:

R = R_complete + R_length

where R_complete is the reward for whether the inference path reaches the target entity, and R_length is a reward inversely proportional to the path length.
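The text gives the decomposition R = R_complete + R_length but not the numeric values, so the sketch below assumes a ±1 reachability term and an inverse-path-length term, matching the "reachability plus inverse proportion of path length" design stated in the beneficial-effects section; the concrete constants are assumptions.

```python
def path_reward(reached_target, path_length):
    """R = R_complete + R_length; the concrete values here are assumed, not from the patent."""
    r_complete = 1.0 if reached_target else -1.0             # did the path reach the target?
    r_length = 1.0 / path_length if reached_target else 0.0  # shorter successful paths score higher
    return r_complete + r_length
```

Under this shaping, any successful path outscores any failure, and among successful paths the shortest one earns the highest reward.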
When the DDPG model is pre-trained, the action network updates its parameters based on the deterministic policy according to:

∇_θ J = E_{s~D}[ ∇_a Q^μ(s, a)|_{a=μ(s)} · ∇_θ μ(s) ]

where J is the objective function of the action network; θ is its parameter set; s is a state; D is the state-space corpus; μ denotes the deterministic action output by the action network; Q^μ(s, a) is the Q value under the deterministic policy μ; a is the relation passed from the action network to the target critic network; and ∇ denotes the gradient.
Through the above training, the result of the current round is output and whether the loop has finished is judged; when it has, the neural network model is stored and updated, and when it has not, the training cycle repeats.
S3, after the neural network of the DDPG has been trained and updated, relational reasoning is run or tested:
Step 1: input the entity pair <e_head, e_end> whose relationship is to be inferred;
Step 2: using the DDPG model trained in S2, represent the entity pair <e_head, e_end> as vectors and feed it into the neural network of the DDPG model;
Step 3: obtain the output result, or verify it.
The invention provides a path reasoning method based on the combination of prior knowledge and the deep deterministic policy gradient, which realizes relationship path reasoning between any two entities in a regulation knowledge graph and can further be applied to regulation-assisted decision making. Correspondingly, the system of the invention, as shown in FIG. 4, comprises:
the entity vectorization embedding module is used for vectorizing and expressing the entity pairs in the regulation knowledge graph;
and the relationship reasoning module is used for reasoning the entity pair expressed by the vectorization through a pre-trained DDPG model to obtain the optimal relationship path of the entity pair.
The relationship reasoning module further comprises a prior-knowledge-base construction module, used for constructing a prior knowledge base from the entity pairs of the regulation knowledge graph; specifically, all relation paths existing between any two entities in the regulation knowledge graph are found by depth-first search and stored for training in the following relation-path triple format:

P_n = {(e_head, r, e_m), ..., (e_m, r, e_end)}

where P_n is the nth relation path; e_head and e_end are the head and tail entities; e_m is an intermediate entity connected to the head or tail entity; and r is a relation in the knowledge graph.
The relationship reasoning module is used both for relationship-reasoning training and for relationship-reasoning operation or testing.
The entity vectorization embedding module mainly realizes the vectorized representation of entities and relations in the knowledge graph.
During training, the relationship reasoning module mainly constructs the prior knowledge base through the prior-knowledge construction module, builds the deep neural network and updates its parameters, designs the reward function through the reward mechanism, and trains and stores the model.
During operation or testing, the relationship reasoning module performs relation path reasoning over the knowledge-graph entities based on the trained model.
Specifically, when the neural network is constructed, the state space s is defined as 2 × n parameters according to the embedding dimension n of the regulation-knowledge-graph entities and relations, so the input layer has 2 × n neurons; the action space is the set of relations in the knowledge graph, so if there are m relations the output layer has m neurons. The number of hidden layers and their neuron counts are chosen according to the scale of the knowledge graph. After construction is complete, training proceeds.
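The sizing rule above (input 2 × n from the embedding dimension, output m relations) can be written down directly. The hidden-layer widths below are placeholders, since the text only says they are chosen by the graph's scale.

```python
def ddpg_layer_sizes(n_embed, n_relations, hidden=(128, 64)):
    """Layer widths: input = two n-dimensional embeddings concatenated (state + target),
    output = one score per relation in the knowledge graph."""
    return (2 * n_embed,) + tuple(hidden) + (n_relations,)

# e.g. 50-dimensional embeddings and 20 distinct relations
sizes = ddpg_layer_sizes(n_embed=50, n_relations=20)  # -> (100, 128, 64, 20)
```

Any deep-learning framework can then instantiate fully connected layers from consecutive pairs in `sizes`.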
The present invention also provides a computer apparatus comprising a memory for storing a computer program, and a processor which, when executing the computer program, implements the steps of the entity relationship path inference method for a regulation knowledge graph.
The present invention also provides a computer readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the method for entity relationship path inference for regulating a knowledge-graph as described above.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: although the present invention has been described in detail with reference to the above embodiments, those skilled in the art can make modifications and equivalents to the embodiments of the present invention without departing from the spirit and scope of the present invention, which is set forth in the claims of the present application.
It will be appreciated by those skilled in the art that the invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The embodiments disclosed above are therefore to be considered in all respects as illustrative and not restrictive. All changes which come within the scope of or equivalence to the invention are intended to be embraced therein.
Claims (10)
1. An entity relationship path reasoning method for a regulation knowledge graph, comprising:
vectorizing and expressing the entity pairs in the regulation and control knowledge graph;
and reasoning over the vectorized entity pair through a pre-trained DDPG model to obtain the optimal relation path of the entity pair.
2. The entity relationship path reasoning method for a regulation knowledge graph according to claim 1, wherein the pre-trained DDPG model is trained as follows:
vectorizing and expressing the entity pairs in the regulation and control knowledge graph;
randomly generating a sample selection probability; if the sample selection probability is not greater than a set expert experience threshold, constructing a prior knowledge base from the entity pairs of the regulation knowledge graph and using it as a training sample;
if the sample selection probability is greater than the set expert experience threshold, generating a random sample through the neural network of the DDPG model according to the entity pairs in the regulation knowledge graph, and using the random sample as a training sample;
the training samples are collected into an experience replay unit of the DDPG model, and the neural network of the DDPG model is trained by sampling the training samples in the experience replay unit according to a set rule.
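The sample-selection and replay scheme of claim 2 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `select_training_sample`, the toy prior paths, the placeholder policy sample, and the buffer size are all assumptions of this sketch.

```python
import random
from collections import deque

def select_training_sample(prior_paths, policy_sample, expert_threshold=0.5, rng=random):
    """Claim-2 style selection: draw a random selection probability; at or
    below the expert-experience threshold, take a sample from the prior
    knowledge base, otherwise take one generated by the policy network."""
    if rng.random() <= expert_threshold:
        return rng.choice(prior_paths)   # expert sample from the prior knowledge base
    return policy_sample()               # random sample from the DDPG policy network

# Collect samples into an experience-replay buffer, then train on minibatches.
prior = [("e1", "r1", "e2"), ("e2", "r2", "e3")]
buffer = deque(maxlen=10000)
for _ in range(100):
    buffer.append(select_training_sample(prior, lambda: ("e_rand", "r_rand", "e_rand")))
minibatch = random.sample(list(buffer), 8)
```

In a full DDPG trainer, each minibatch would feed one gradient step of the actor and critic networks; here the buffer simply demonstrates the mixing of expert and policy-generated samples.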
3. The entity relationship path reasoning method for a regulation knowledge graph according to claim 2, wherein constructing the prior knowledge base from the entity pairs of the regulation knowledge graph specifically comprises: searching, by depth-first search, all relation paths existing between any two entities in the regulation knowledge graph, and storing the relation paths in the following relation-path triple format according to the training requirements:
P_n = {(e_head, r, e_m), ..., (e_m, r, e_end)}
where P_n is the n-th relation path; e_head and e_end are the head entity and the tail entity, respectively; e_m is an intermediate entity connected to the head entity or the tail entity; and r is a relation in the knowledge graph.
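The depth-first path enumeration of claim 3 might look like the sketch below; the helper `all_relation_paths`, the toy graph `kg`, and the `max_depth` cut-off are assumptions added for illustration, not part of the patent.

```python
def all_relation_paths(graph, head, tail, max_depth=4):
    """Depth-first search: enumerate every relation path between two
    entities, each stored as a chain of (entity, relation, entity) triples
    matching P_n = {(e_head, r, e_m), ..., (e_m, r, e_end)}."""
    paths = []
    def dfs(entity, path, visited):
        if entity == tail and path:
            paths.append(list(path))   # reached the tail entity: record the path
            return
        if len(path) >= max_depth:     # bound the search depth
            return
        for relation, neighbour in graph.get(entity, []):
            if neighbour not in visited:
                path.append((entity, relation, neighbour))
                dfs(neighbour, path, visited | {neighbour})
                path.pop()
    dfs(head, [], {head})
    return paths

# Toy graph: A --r1--> B --r3--> C, and A --r2--> C directly.
kg = {"A": [("r1", "B"), ("r2", "C")], "B": [("r3", "C")]}
paths = all_relation_paths(kg, "A", "C")
```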
4. The entity relationship path reasoning method for a regulation knowledge graph according to claim 1 or 2, wherein, when the entity pairs in the regulation knowledge graph are vectorized, any triple <entity 1, relation, entity 2> in the knowledge graph is converted into a continuous representation in a vector space, i.e., entity vector 1 + relation vector = entity vector 2; the method specifically comprises the following steps:
step 1, determining the dimension n of the vector space, the number of entities |E|, and the number of relations |R| from a knowledge graph containing any number of triples <entity 1, relation, entity 2>, and generating (|E| + |R|) × n vector parameters;
step 2, screening a fixed-size batch of triples and calculating the error from the entity and relation vectors in each triple according to the following formula:
error = ||e_head + r − e_end||
where the vector e_head is the head entity of the triple, the vector e_end is the tail entity of the triple, and the vector r is the relation connecting the head entity and the tail entity;
step 3, updating the vector parameters in the triples by gradient descent;
and step 4, repeating step 2 and step 3 until the error is minimized, at which point training is complete and the vectorized representation of the entity pairs is obtained.
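Steps 1-4 amount to a TransE-style translation embedding (entity 1 + relation ≈ entity 2). A minimal pure-Python sketch follows; the function name, dimension, learning rate, and epoch count are illustrative choices, not values from the patent.

```python
import random

def transe_train(triples, entities, relations, n=8, lr=0.01, epochs=200):
    """TransE-style sketch of claim 4: learn n-dimensional vectors so that
    e_head + r ≈ e_end, minimizing the squared error ||e_head + r - e_end||^2
    by gradient descent."""
    rnd = random.Random(0)
    # Step 1: (|E| + |R|) x n randomly initialized vector parameters.
    vec = {x: [rnd.uniform(-0.5, 0.5) for _ in range(n)]
           for x in list(entities) + list(relations)}
    for _ in range(epochs):
        for h, r, t in triples:
            # Step 2: per-triple error vector e_head + r - e_end.
            diff = [vec[h][i] + vec[r][i] - vec[t][i] for i in range(n)]
            # Step 3: gradient-descent update of all three vectors.
            for i in range(n):
                vec[h][i] -= lr * 2 * diff[i]
                vec[r][i] -= lr * 2 * diff[i]
                vec[t][i] += lr * 2 * diff[i]
    return vec

vec = transe_train([("e1", "likes", "e2")], ["e1", "e2"], ["likes"])
err = sum((vec["e1"][i] + vec["likes"][i] - vec["e2"][i]) ** 2 for i in range(8))
```

A production system would also use negative sampling and a margin loss as in the original TransE; the plain squared error above keeps the sketch short.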
5. The entity relationship path reasoning method for a regulation knowledge graph according to claim 1 or 2, wherein, in the pre-trained DDPG model, the comment network performs network learning according to the following loss function:
y = r + γ · Q'(s, a)
L(θ) = E[(y − Q(s, a; θ))²]
where y is the target Q value; Q'(s, a) is the Q value of the target comment network; r is the reward function; s is a state; a is the relation vector transmitted from the target action network to the target comment network; γ is the discount factor; L(θ) is the squared loss between the Q value of the action network and the target Q value; θ is the parameter set of the action network; and E denotes the mean (expectation).
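A hedged sketch of such a critic ("comment" network) loss, assuming the usual DDPG form y = r + γ·Q'(s', μ'(s')); the toy linear Q functions and the sample transition are invented purely for illustration.

```python
def critic_loss(batch, q, target_q, target_mu, gamma=0.99):
    """Squared TD loss for the critic network: y = r + gamma * Q'(s', mu'(s'));
    L = mean over the batch of (y - Q(s, a))^2."""
    total = 0.0
    for s, a, r, s_next in batch:
        y = r + gamma * target_q(s_next, target_mu(s_next))   # target Q value
        total += (y - q(s, a)) ** 2
    return total / len(batch)

# One toy transition with linear Q functions, purely illustrative.
loss = critic_loss(
    [((0.0,), (1.0,), 1.0, (1.0,))],          # (state, action, reward, next state)
    q=lambda s, a: s[0] + a[0],
    target_q=lambda s, a: s[0] + a[0],
    target_mu=lambda s: (0.5,),
)
```

In real DDPG the target networks are slowly updated copies of the online networks, which stabilizes this TD target.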
6. The entity relationship path reasoning method for a regulation knowledge graph according to claim 5, wherein the reward function R is represented by the following formula:
R=Rcomplete+Rlength
where R_complete is the reward value for whether the inference path reaches the target point;
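Since the text defines only R_complete, the following sketch assumes a +1/−1 completion reward and an R_length term that penalizes path length; the specific values and the `step_penalty` parameter are illustrative assumptions, not part of the patent.

```python
def reward(reached_target, path_length, step_penalty=0.05):
    """Reward as R = R_complete + R_length, assuming +1 for reaching the
    target entity (else -1) plus a penalty proportional to path length."""
    r_complete = 1.0 if reached_target else -1.0   # did the path reach the target?
    r_length = -step_penalty * path_length         # prefer shorter relation paths
    return r_complete + r_length
```

A length penalty of this kind biases the agent toward short, interpretable relation paths between the entity pair.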
7. The entity relationship path reasoning method for a regulation knowledge graph according to claim 1 or 2, wherein, in the pre-trained DDPG model, the action network updates its parameters based on a deterministic policy according to the following formula:
∇J(θ) = E_{s∈D}[ ∇_μ Q_μ(s, a) · ∇_θ μ_θ(s) |_{a=μ_θ(s)} ]
where J is the objective function of the target action network; θ is the parameter set of the target action network; s is a state; D is the full state space; μ denotes the deterministic action output by the target action network; Q_μ(s, a) is the Q value under the deterministic action μ; a is the relation transmitted from the target action network to the target comment network; and ∇ denotes the gradient.
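The deterministic policy gradient above can be illustrated with a scalar toy example; the helper `actor_gradient` and the linear μ and Q functions are assumptions of this sketch, not part of the patent.

```python
def actor_gradient(states, dq_da, dmu_dtheta, mu):
    """Deterministic policy gradient (scalar sketch):
    grad J(theta) = E_s[ dQ/da |_{a=mu(s)} * dmu/dtheta(s) ]."""
    grads = [dq_da(s, mu(s)) * dmu_dtheta(s) for s in states]
    return sum(grads) / len(grads)   # empirical mean over sampled states

# theta parameterizes mu(s) = theta * s; with Q(s, a) = s * a we have
# dQ/da = s and dmu/dtheta = s, so each per-state gradient is s^2.
theta = 0.5
g = actor_gradient([1.0, 2.0],
                   dq_da=lambda s, a: s,
                   dmu_dtheta=lambda s: s,
                   mu=lambda s: theta * s)
```

In vector form the same chain rule is applied per action dimension, which is what deep-learning frameworks compute automatically by backpropagating Q through the actor.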
8. An entity relationship path reasoning system for a regulation knowledge graph, characterized by comprising:
the entity vectorization embedding module is used for vectorizing and expressing the entity pairs in the regulation knowledge graph;
and a relationship reasoning module for reasoning over the vectorized entity pair through a pre-trained DDPG model to obtain the optimal relation path of the entity pair.
9. The entity relationship path reasoning system for a regulation knowledge graph according to claim 8, wherein the relationship reasoning module further comprises a prior knowledge base construction module for constructing a prior knowledge base from the entity pairs of the regulation knowledge graph; specifically, all relation paths existing between any two entities in the regulation knowledge graph are searched by depth-first search and stored in the following relation-path triple format according to the training requirements:
P_n = {(e_head, r, e_m), ..., (e_m, r, e_end)}
where P_n is the n-th relation path; e_head and e_end are the head entity and the tail entity, respectively; e_m is an intermediate entity connected to the head entity or the tail entity; and r is a relation in the knowledge graph.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the entity relationship path reasoning method for a regulation knowledge graph of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110462388.7A CN113128689A (en) | 2021-04-27 | 2021-04-27 | Entity relationship path reasoning method and system for regulating knowledge graph |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113128689A true CN113128689A (en) | 2021-07-16 |
Family
ID=76780409
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110462388.7A Pending CN113128689A (en) | 2021-04-27 | 2021-04-27 | Entity relationship path reasoning method and system for regulating knowledge graph |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113128689A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110796254A (en) * | 2019-10-30 | 2020-02-14 | 南京工业大学 | Knowledge graph reasoning method and device, computer equipment and storage medium |
CN112084344A (en) * | 2020-09-11 | 2020-12-15 | 清华大学 | Knowledge graph reasoning method, device and storage medium |
WO2021054512A1 * | 2019-09-18 | 2021-03-25 | Saltlux Co., Ltd. | System and method for reinforcing knowledge base |
CN112668235A (en) * | 2020-12-07 | 2021-04-16 | 中原工学院 | Robot control method of DDPG algorithm based on offline model pre-training learning |
2021-04-27: CN application CN202110462388.7A filed (published as CN113128689A); legal status: Pending
Non-Patent Citations (2)
Title |
---|
贺兰钦: "Research on Deployment and Migration Optimization Algorithms for Virtual Network Functions Based on Deep Reinforcement Learning", China Master's Theses Full-text Database, Information Science and Technology Series * |
隋洪建 et al.: "Robot Control Policy Transfer Based on Progressive Neural Networks", Journal of University of Science and Technology of China * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113988508A (en) * | 2021-09-22 | 2022-01-28 | 国网天津市电力公司电力科学研究院 | Power grid regulation and control strategy optimization method based on reinforcement learning |
CN116089611A (en) * | 2023-01-13 | 2023-05-09 | 北京控制工程研究所 | Spacecraft fault diagnosis method and device based on performance-fault relation map |
CN116304083A (en) * | 2023-01-13 | 2023-06-23 | 北京控制工程研究所 | Relation prediction method and device for performance-fault relation map |
CN116089611B (en) * | 2023-01-13 | 2023-07-18 | 北京控制工程研究所 | Spacecraft fault diagnosis method and device based on performance-fault relation map |
CN116304083B (en) * | 2023-01-13 | 2023-09-15 | 北京控制工程研究所 | Relation prediction method and device for performance-fault relation map |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022083624A1 (en) | Model acquisition method, and device | |
CN113128689A (en) | Entity relationship path reasoning method and system for regulating knowledge graph | |
CN110782015A (en) | Training method and device for network structure optimizer of neural network and storage medium | |
Peng et al. | Accelerating minibatch stochastic gradient descent using typicality sampling | |
US20200167659A1 (en) | Device and method for training neural network | |
Sun et al. | Automatically evolving cnn architectures based on blocks | |
CN107544960B (en) | Automatic question-answering method based on variable binding and relation activation | |
CN113361680A (en) | Neural network architecture searching method, device, equipment and medium | |
CN112465120A (en) | Fast attention neural network architecture searching method based on evolution method | |
CN107451230A (en) | A kind of answering method and question answering system | |
CN110968512B (en) | Software quality evaluation method, device, equipment and computer readable storage medium | |
Zhang et al. | PS-Tree: A piecewise symbolic regression tree | |
CN111768004A (en) | Model self-adaption method and system based on intelligent computing framework | |
Li et al. | Symbolic expression transformer: A computer vision approach for symbolic regression | |
Cheng et al. | Swiftnet: Using graph propagation as meta-knowledge to search highly representative neural architectures | |
CN110765267A (en) | Dynamic incomplete data classification method based on multi-task learning | |
CN115345303A (en) | Convolutional neural network weight tuning method, device, storage medium and electronic equipment | |
YOUSIF et al. | Deep learning-based surrogate modeling for performance-driven generative design systems | |
CN114462526A (en) | Classification model training method and device, computer equipment and storage medium | |
Hinojosa et al. | Multi-objective evolutionary algorithm for tuning the Type-2 inference engine on classification task | |
Ding et al. | High generalization performance structured self-attention model for knapsack problem | |
CN111882124A (en) | Homogeneous platform development effect prediction method based on generation confrontation simulation learning | |
Panda | A survey on application of population based algorithm on hyperparameter selection | |
CN115620807B (en) | Method for predicting interaction strength between target protein molecule and drug molecule | |
CN117235533B (en) | Object variable analysis method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210716 |