CN116127084A

CN116127084A - Knowledge graph-based micro-grid scheduling strategy intelligent retrieval system and method

Info

Publication number: CN116127084A
Application number: CN202211298737.7A
Authority: CN
Inventors: 牛焕娜; 窦伟; 李宗晟; 薛佳炜; 井天军; 王江波
Original assignee: China Agricultural University
Current assignee: China Agricultural University
Priority date: 2022-10-21
Filing date: 2022-10-21
Publication date: 2023-05-16

Abstract

The invention relates to a knowledge graph-based intelligent retrieval system and method for micro-grid scheduling strategies. Using natural language processing techniques associated with knowledge graphs, such as named entity recognition, combined with artificial intelligence techniques such as machine learning and in particular deep learning, the invention constructs a knowledge graph for the micro-grid operation scheduling field from relevant micro-grid scheduling documents and data, builds a knowledge graph-based retrieval system architecture for micro-grid scheduling strategies, and specifies the working process and key technical steps of the system. Based on the real-time operating data obtained from each micro-grid, the current operating state of the micro-grid is mapped to entities and relations in the micro-grid scheduling domain knowledge graph through steps such as logic judgment, word segmentation and entity recognition; the scheduling strategy for the current operating state is generated by retrieval, and the result is fed back to the corresponding dispatcher.

Description

Knowledge graph-based micro-grid scheduling strategy intelligent retrieval system and method

Technical Field

The invention relates to the field of micro-grid operation regulation, in particular to a micro-grid scheduling strategy intelligent retrieval system based on a knowledge graph.

Background

As the penetration of renewable energy increases, micro-grids are developing rapidly thanks to their flexibility, intelligence and compatibility. Coordinated control of the distributed resources inside a micro-grid enables complementary energy operation, which reduces operating cost and improves both renewable energy utilization and operating stability. Operating scenarios and configuration structures are becoming increasingly complex, so to keep the micro-grid system efficient and reliable it is necessary to study a reliable, general method for formulating micro-grid scheduling and control strategies with short decision times, thereby improving the economic benefit and regulation performance of the micro-grid. Under the conventional approach, scheduling strategies are formulated from physical mechanism analysis by establishing an optimal scheduling model and applying an optimization algorithm. After a micro-grid has been operated this way for a sufficiently long period, a rich set of operating scenarios and their corresponding scheduling strategies accumulates. These scenarios and strategies can serve as a scheduling rule base comparable to expert knowledge for formulating scheduling strategies in future periods, which can effectively improve the efficiency of micro-grid scheduling decisions.

With the development of artificial intelligence in recent years, using deep learning in a data-driven way to improve the intelligent regulation performance of the micro-grid offers the potential advantage of being "model-free", and has become a new approach to formulating micro-grid regulation strategies. A knowledge graph is a semantic network that reveals the relations between entities; it describes real-world things and their interrelationships in the form of triples, forming a networked knowledge structure. Compared with traditional ways of organizing and managing knowledge, the graph-based data structure of a knowledge graph supports more efficient data retrieval and can handle complex and diverse associations. Therefore, constructing a domain knowledge graph of micro-grid scheduling strategies, organizing and storing scheduling knowledge as a graph, and using a computer for semantic search and decision support can help dispatchers grasp the key information about the micro-grid operating state comprehensively and quickly. Providing such intelligent information services for scheduling strategy formulation has broad prospects.

In micro-grid scheduling strategy formulation, most existing research is based on physical mechanism analysis: a scheduling strategy is formulated by establishing an optimal scheduling model and applying some optimization algorithm. With the development of new energy and distributed power sources, the uncertainty of micro-grid operation and of user power consumption is increasing and the situation is becoming more and more complex. Traditional mathematical modeling based on the physical system suffers from a complicated and redundant solution process, the curse of dimensionality, low optimization efficiency and a tendency to fall into local optima, so it is difficult to meet the real-time requirements of current micro-grid operation. On the other hand, in information systems for power dispatching, power companies have developed many application systems and introduced knowledge engineering technology, in particular the expert system framework, achieving the step from data to knowledge. Performing cluster analysis on the overall source-load-storage characteristics using large amounts of historical data, and learning from and imitating grid regulation experience and regulations, are strengths of current artificial intelligence technologies such as knowledge graphs. The design of future micro-grid regulation systems is therefore expected to expand from mathematical modeling analysis to a combination of mathematical modeling and knowledge-driven methods, and finally to evolve into knowledge-guided operation.

Domain knowledge graphs, characterized by large data scale, rich semantic relations, high quality and a friendly structure, are gradually being applied in fields and industries such as medicine and finance. In the power industry, research and application focus mainly on fault maintenance, intelligent customer service and similar areas; knowledge graphs have not yet been applied to the formulation of micro-grid operation scheduling strategies. If knowledge graphs, as a new knowledge engineering technology, are introduced into micro-grid scheduling strategy formulation, the concepts, entities and events in micro-grid scheduling and the relations among them can be represented in a structured way, and the scheduling strategy can then be formulated according to the perceived operating state. This can provide a practical solution to the micro-grid operation scheduling problem.

Disclosure of Invention

In view of the shortcomings of the prior art, the invention aims to provide a knowledge graph-based intelligent retrieval system for micro-grid scheduling strategies. Using natural language processing techniques associated with knowledge graphs, such as named entity recognition, combined with artificial intelligence techniques such as machine learning and in particular deep learning, the invention constructs a knowledge graph for the micro-grid operation scheduling field from relevant micro-grid scheduling documents and data, builds a knowledge graph-based retrieval system architecture for micro-grid scheduling strategies, and specifies the working process and key technical steps of the system. Based on the real-time operating data obtained from each micro-grid, the current operating state of the micro-grid is mapped to entities and relations in the micro-grid scheduling domain knowledge graph through steps such as logic judgment, word segmentation and entity recognition; the scheduling strategy for the current operating state is generated by retrieval, and the result is fed back to the corresponding dispatcher.

In order to achieve the above purpose, the invention adopts the following technical scheme:

a knowledge graph-based intelligent retrieval system for micro-grid scheduling strategies, characterized by comprising an information analysis module, a micro-grid state judgment module, a micro-grid scheduling domain knowledge graph and a scheduling strategy retrieval module;

the information analysis module performs calculation and logic judgment on the real-time operating data of each micro-grid uploaded to the retrieval system, and matches the judgment results with the entities and relations in the micro-grid scheduling domain knowledge graph;

the micro-grid state judgment module determines and confirms the state of the micro-grid to be retrieved according to the entity matching result of the information analysis module, and generates a search graph; the matching result comprises a regulation target and an energy storage condition;

the scheduling strategy retrieval module uses a knowledge computing engine to find the matching knowledge path in the micro-grid scheduling domain knowledge graph and obtain the final retrieval result;

the data in the micro-grid scheduling domain knowledge graph comprises parsed structured data and annotated semi-structured and unstructured data, wherein the structured data comes from the grid dispatching system;

the annotated semi-structured and unstructured data are divided into data entities and data entity relations, and specifically comprise the micro-grid operation mode, the micro-grid scheduling strategy, the micro-grid regulation target and the scheduling principle.

On the basis of the scheme, the steps of storing the parsed structured data into the knowledge graph in the micro-grid dispatching field are as follows:

step 1, connecting to the database and performing initialization;

step 2, constructing an SQL statement and querying the data;

step 3, converting data types, structures and attributes;

step 4, judging whether the data already exists in the database; if so, returning to step 2, otherwise proceeding to step 5 to store the data;

step 5, constructing a data storage statement, determining the hierarchical (superordinate-subordinate) relationship from the information extracted by the SQL statement, and creating a node;

step 6, judging whether all SQL statements have been queried; if so, exiting the extraction flow, otherwise returning to step 2 and continuing to construct SQL statements to query the data;

the structured data includes: micro-grid load data, wind/light renewable energy generation power and energy storage charge state;

the database is a Neo4j database; the data storage statement is a Neo4j data storage statement.
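The six-step extraction flow above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the table `readings`, its columns, the node labels `Microgrid` and `Quantity`, and the relationship `HAS_QUANTITY` are hypothetical names that do not appear in the source. An in-memory SQLite database stands in for the grid dispatching system, a Python set stands in for the "already stored?" check of step 4, and the Cypher statements are only built, not sent to a Neo4j server.

```python
import sqlite3

def build_store_statement(row):
    """Step 5: build a parameterised Cypher statement for one record (hypothetical schema)."""
    microgrid, quantity, value = row
    cypher = (
        "MERGE (m:Microgrid {name: $microgrid}) "
        "MERGE (q:Quantity {microgrid: $microgrid, name: $quantity}) "
        "MERGE (m)-[:HAS_QUANTITY]->(q) "
        "SET q.value = $value"
    )
    params = {"microgrid": microgrid, "quantity": quantity, "value": float(value)}
    return cypher, params

def extract_structured(conn, queries):
    """Steps 2 to 6: run each SQL query and emit one storage statement per new record."""
    stored = set()                       # assumption: stands in for the graph-database existence check
    statements = []
    for sql in queries:                  # step 6: continue until every query has been run
        for row in conn.execute(sql):    # step 2: query the data
            key = (row[0], row[1])       # step 3: normalise the record to a (microgrid, quantity) key
            if key in stored:            # step 4: skip records that already exist
                continue
            stored.add(key)
            statements.append(build_store_statement(row))
    return statements

# Step 1: connect and initialise (toy data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (microgrid TEXT, quantity TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("A", "load_kw", 120.5), ("A", "pv_power_kw", 80.0),
     ("A", "storage_soc", 0.62), ("A", "load_kw", 120.5)],
)
queries = ["SELECT microgrid, quantity, value FROM readings ORDER BY rowid"]
statements = extract_structured(conn, queries)
for cypher, params in statements:
    print(params)
```

In a real deployment each `(cypher, params)` pair would be passed to a Neo4j session; `MERGE` is used here so that re-running the flow does not create duplicate nodes.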

Based on the scheme, the data entity uses a BiLSTM-CRF model for identification and attribute extraction, and the specific method comprises the following steps:

step 1, segmenting the micro-grid scheduling strategy text into words using jieba, training word2vec from the gensim toolkit to obtain a word vector matrix, and mapping the text to be recognized with the trained word vector matrix to form a word vector sequence $x=\{x_1,x_2,\ldots,x_n\}$, where $x_t$ denotes the input vector of the $t$-th word, $t=1,2,3,\ldots,n$;

step 2, taking the word vector sequence $x$ obtained in step 1 as the input of the forward LSTM layer, and the reversed sequence of $x$ as the input of the backward LSTM layer; at time $t$, the hidden state $h_R$ output by the forward LSTM layer and the hidden state $h_L$ output by the backward LSTM layer are concatenated position by position, $h_t=[h_R;h_L]\in\mathbb{R}^{m}$, to obtain the complete hidden state sequence $(h_1,h_2,\ldots,h_n)\in\mathbb{R}^{n\times m}$, where $m$ is the hidden state vector dimension, as shown below:

$h_R=f(Wx_t+Uh_{t-1}+b)$ (1);

$h_L=f(Wx_t+Uh_{t+1}+b)$ (2);

$h_t=[h_R;h_L]$ (3);

where $f(\cdot)$ is a nonlinear activation function, $W=(w_1,w_2,\ldots,w_n)^{T}$ is the state-input weight matrix, $U=(u_1,u_2,\ldots,u_n)^{T}$ is the state-state weight matrix, $x_t$ is the input at the current time, $[\,\cdot\,;\,\cdot\,]$ denotes vector concatenation, $h_t$ is the complete hidden state concatenated at time $t$ and represents the external state at the current time, $h_{t-1}$ is the external state at the previous time, $h_{t+1}$ is the external state at the next time, and $b$ is the bias;

step 3, passing the complete hidden state sequence to the output layer, which maps each $m$-dimensional vector to a $k$-dimensional vector, where $k$ is the number of labels in the label set, and outputs the $n\times k$ feature matrix $P=(p_1,p_2,\ldots,p_n)\in\mathbb{R}^{n\times k}$; each component $P_{ij}$ of $p_i\in\mathbb{R}^{k}$ is the score for assigning the word $x_i$ to the $j$-th label;

step 4, inputting the matrix $P$ obtained in step 3 into the CRF model, which learns the labeling rules among labels to calculate scores and outputs the optimal label sequence, as shown in formula (4):

$s(x,y)=\sum_{i=1}^{n-1}M_{y_i,y_{i+1}}+\sum_{i=1}^{n}P_{i,y_i}$ (4);

in the above formula, $s(x,y)$ is the score of the CRF layer predicting the label sequence $y=\{y_1,y_2,\ldots,y_n\}$ for the input sequence $x=\{x_1,x_2,\ldots,x_n\}$; $M_{y_i,y_{i+1}}$ is the score taken from the state transition matrix $M$ of the CRF model, each element of which represents the likelihood of a transition from label $y_i$ to label $y_{i+1}$; $P_{i,y_i}$ is the score for assigning the word $x_i$ to label $y_i$;

formula (4) is normalized using the Softmax function to obtain the model probability, as shown in formula (5):

$P(y\mid x)=\dfrac{\exp\big(s(x,y)\big)}{\sum_{y'\in Y(x)}\exp\big(s(x,y')\big)}$ (5);

in the above formula, $P(y\mid x)$ is the probability that the input sequence $x$ is assigned the label sequence $y$; $y'$ denotes one possible label sequence, $y'\in Y(x)$, where $Y(x)$ denotes all possible label sequences; $\sum_{y'\in Y(x)}\exp\big(s(x,y')\big)$ is the sum of the exponentiated scores of all label sequences; the $y$ with the largest probability is output as the final label sequence.
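As an illustration of formulas (4) and (5), the following sketch scores a label sequence from an emission matrix `P` and a transition matrix `M`, normalizes it by brute-force enumeration, and finds the best sequence with Viterbi decoding. Several things here are assumptions rather than statements of the source: the toy numbers, the three-label set (0 = O, 1 = B, 2 = I), the absence of start/end transitions, and the use of Viterbi itself, since the source only says that the sequence with the largest probability is output.

```python
import math
from itertools import product

def sequence_score(P, M, y):
    """Formula (4): transition scores plus emission scores for label sequence y."""
    emission = sum(P[i][y[i]] for i in range(len(y)))
    transition = sum(M[y[i]][y[i + 1]] for i in range(len(y) - 1))
    return emission + transition

def sequence_probability(P, M, y):
    """Formula (5): softmax over all label sequences (brute force, toy sizes only)."""
    n, k = len(P), len(P[0])
    z = sum(math.exp(sequence_score(P, M, list(c))) for c in product(range(k), repeat=n))
    return math.exp(sequence_score(P, M, y)) / z

def viterbi(P, M):
    """Return the label sequence with the highest score s(x, y)."""
    n, k = len(P), len(P[0])
    best = list(P[0])                    # best score of any path ending in each label
    back = []                            # back-pointers for positions 1 .. n-1
    for i in range(1, n):
        step, pointers = [], []
        for j in range(k):
            prev = max(range(k), key=lambda a: best[a] + M[a][j])
            pointers.append(prev)
            step.append(best[prev] + M[prev][j] + P[i][j])
        best = step
        back.append(pointers)
    path = [max(range(k), key=lambda j: best[j])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Toy example: 3 words, labels 0 = O, 1 = B, 2 = I (illustrative values).
P = [[0.1, 2.0, 0.3], [0.2, 0.1, 1.5], [1.8, 0.2, 0.4]]
M = [[0.5, 0.2, -5.0], [-0.3, -1.0, 1.0], [0.4, 0.1, 0.6]]   # O -> I is heavily penalised
best_path = viterbi(P, M)
print(best_path, round(sequence_probability(P, M, best_path), 3))
```

The strongly negative `M[0][2]` entry shows how the transition matrix encodes a labeling rule: an I tag should not directly follow an O tag.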

Based on the above scheme, the data entity relationships are extracted and classified using a BiGRU-Attention model, and the specific method comprises the following steps:

step 1, segmenting the micro-grid scheduling strategy text into words using jieba, training word2vec from the gensim toolkit to obtain a word vector matrix, and mapping the text to be recognized with the word vector matrix to obtain a word vector sequence $x=\{x_1,x_2,\ldots,x_n\}$, where $x_i$ denotes the input vector of the $i$-th word, $i=1,2,3,\ldots,n$;

step 2, taking the word vector sequence $x$ obtained in step 1 as the input of the forward GRU layer, and the reversed sequence of $x$ as the input of the backward GRU layer; at time $t$, the hidden state $h_R$ output by the forward GRU layer and the hidden state $h_L$ output by the backward GRU layer are concatenated position by position, $h_t=[h_R;h_L]\in\mathbb{R}^{m}$, to obtain the complete hidden state sequence $(h_1,h_2,\ldots,h_n)\in\mathbb{R}^{n\times m}$, where $m$ is the hidden state vector dimension, as shown in formulas (1)-(3);

step 3, assigning a different weight to each word in the attention layer, combining the resulting attention weights, and calculating the output representation $C_t$ of the attention layer, as shown in formula (6):

$C_t=\sum_{t=1}^{N}a_t h_t$ (6);

in the above formula, $N$ is the sentence length, $h_t$ is the external state at the current time, and $a_t$ is the weight at time $t$, which is computed from $h_t$ using the weight matrix $W^{t}$ and the bias term $b^{t}$;

step 4, passing $C_t$ obtained in step 3 through the fully connected layer, and inputting the learned vector to the Softmax layer to predict the probability of each relation category, as shown in formulas (7) and (8):

$p(y\mid x)=\mathrm{softmax}(W^{s}C_t+b^{s})$ (7)

$\hat{y}=\arg\max_{y}\,p(y\mid x)$ (8)

in the above formulas, $\hat{y}$ is the relation category, i.e. the final relation prediction result; $W^{s}$ is the weight parameter learned by the classifier; $b^{s}$ is the bias term of the classifier.
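The attention pooling and classification of steps 3 and 4 can be sketched in plain Python as below. The exact expression for the attention weights is not legible in the text, so this sketch assumes a common form in which a score `tanh(w . h_t + b)` is computed per time step and normalized with a softmax; that form, the function names and all numeric values are assumptions for illustration only.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(H, w, b):
    """Weighted sum of hidden states: C = sum_t a_t * h_t.

    Assumed weight form: a = softmax(tanh(w . h_t + b)) over the time steps.
    """
    scores = [math.tanh(sum(wi * hi for wi, hi in zip(w, h)) + b) for h in H]
    a = softmax(scores)
    dim = len(H[0])
    c = [sum(a[t] * H[t][d] for t in range(len(H))) for d in range(dim)]
    return a, c

def classify(c, Ws, bs):
    """Formulas (7) and (8): softmax over relation categories, then argmax."""
    logits = [sum(wi * ci for wi, ci in zip(row, c)) + bi for row, bi in zip(Ws, bs)]
    probs = softmax(logits)
    label = max(range(len(probs)), key=lambda j: probs[j])
    return probs, label

# Toy example: 3 time steps, hidden size 2, 3 relation categories.
H = [[0.2, 0.1], [0.9, 0.4], [0.1, 0.8]]
a, c = attention_pool(H, w=[1.0, -0.5], b=0.0)
probs, label = classify(c, Ws=[[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], bs=[0.0, 0.0, 0.0])
print(a, c, probs, label)
```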

On the basis of the scheme, the construction method of the knowledge graph in the micro-grid dispatching field comprises the following steps:

firstly, realizing identification and attribute extraction of a micro-grid dispatching entity by using a BiLSTM-CRF model; secondly, according to the dispatching characteristics of the micro-grid, carrying out relation extraction and classification by using a BiGRU-Attention model, and obtaining a knowledge graph of the dispatching field of the micro-grid and storing the knowledge graph in a Neo4j graph database after data entity linking and knowledge fusion;

knowledge fusion refers to entity disambiguation and coreference resolution; the main task of coreference resolution is to find the synonyms among all words representing entities/attributes, and the process is as follows:

step 1, first using a regular expression to find sentence endings and entity endings in which a noun has been omitted, then determining the boundary of the omitted data entity according to the named entity recognition result and completing the data entity, where the regular expression is shown in formula (9):

([\u4e00-\u9fa5])()?(or)?(and)?(sum)?[\u4e00-\u9fa5]+ (9);

step 2, classifying according to parts of speech: dividing all words representing entities/attributes into a plurality of sets according to parts of speech such as verbs, nouns, adjectives and the like, and respectively carrying out synonym recognition on each set;

step 3, vectorization: training a defect record corpus by a word2vec method for describing semantic similarity among words representing entities/attributes, selecting word vector dimensions as 100 dimensions to obtain word vectors corresponding to all words in the corpus, and judging the similarity among words representing the entities/attributes by calculating cosine similarity among the word vectors;

Step 4, screening word pairs: deleting word pairs with high cosine similarity which appear in the same defect record, removing adjacent word pairs, and screening out co-located word pairs;

step 5, forming a synonym table: combining the same word pairs into a synonym set, forming a plurality of synonym sets, selecting one word from each set as the standardized names of all words in the set, and finally representing the synonym sets in the form of a synonym table.
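Steps 3 to 5 above can be illustrated with the following sketch. It assumes that word vectors are already available (toy 3-dimensional vectors instead of the 100-dimensional word2vec vectors), that part-of-speech grouping (step 2) has already been done, and that a similarity threshold of 0.9 is used; the threshold, the example words and the choice of the first word as the standardized name are all assumptions. Only the "same record" filter of step 4 is modelled, and a union-find structure merges the remaining pairs into synonym sets.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def synonym_table(vectors, records, threshold=0.9):
    """Group words into synonym sets and pick one standardized name per set."""
    parent = {w: w for w in vectors}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]   # path compression
            w = parent[w]
        return w

    for a, b in combinations(sorted(vectors), 2):
        if cosine(vectors[a], vectors[b]) < threshold:      # step 3: similarity test
            continue
        if any(a in rec and b in rec for rec in records):   # step 4: same record, not synonyms
            continue
        parent[find(b)] = find(a)                           # step 5: merge the pair

    groups = {}
    for w in sorted(vectors):
        groups.setdefault(find(w), []).append(w)
    return {name: words for name, words in groups.items() if len(words) > 1}

# Toy vectors and records (illustrative only).
vectors = {
    "battery": [1.0, 0.0, 0.0],
    "storage": [0.98, 0.1, 0.0],
    "charge": [0.0, 1.0, 0.1],
    "discharge": [0.0, 0.97, 0.15],
    "load": [0.0, 0.0, 1.0],
}
records = [{"charge", "discharge"}, {"storage", "load"}]
table = synonym_table(vectors, records)
print(table)
```

Here "charge" and "discharge" have very similar vectors but occur in the same record, so they are kept apart, which is the intent of the screening step.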

The invention further aims to provide a micro-grid scheduling strategy intelligent retrieval method based on a knowledge graph.

the intelligent retrieval method of the micro-grid dispatching strategy based on the knowledge graph is characterized by comprising the following steps of:

step 1: generating a search graph Q according to the state corresponding to the current operating data of the micro-grid;

step 2: dividing the search graph Q into a plurality of sub-search graphs;

step 3: executing the sub-searches of step 2 in the knowledge graph to obtain the results of all sub-searches;

step 4: joining the sub-search results obtained in step 3 to generate the matching subgraph, i.e. the final retrieval result.

wherein $Q=(E_Q,R_Q)$ consists of the node set $E_Q$ and the edge set $R_Q$; the nodes in $E_Q$ correspond to entities, and the edges in $R_Q$ represent the relationship between any two nodes;

for the knowledge graph $G=(E_G,R_G)$, the subgraph satisfying the mapping function $F$ is defined as the matching subgraph $\varphi(Q)$, i.e. $\varphi(Q)$ maps the nodes $E_Q$ in $Q$ to nodes $\varphi(E_G)$ in $G$ and the edges $R_Q$ in $Q$ to edges $\varphi(R_G)$ in $G$; the mapping function $F$ refers to the correspondence of elements between the knowledge graph and the matching subgraph; $E_G$ is the set of nodes in the knowledge graph $G$ and $R_G$ is the set of edges in $G$; the nodes in $E_G$ correspond to entities, and the edges in $R_G$ represent the relationship between any two nodes.

On the basis of the above-mentioned scheme,

the sub-search division in step 2 is as follows: the search graph Q is divided into two-level tree structures, each sub-search graph comprising a root node, one layer of child nodes and the edges between them;

in step 3, the sub-search is executed in the knowledge graph as follows: the sub-search graph is decomposed into a minimum spanning tree, and the edges of the minimum spanning tree are matched first; a vertex with strong filtering capability is selected as the root node and matched first; the sub-search is executed on the basis of the VF2 graph matching algorithm combined with the label features of the graph.

Based on the above scheme, the step 4 specifically includes:

step 4-1, initializing the sub-search result set $C$, and executing every divided sub-search $Q_i\in\{Q_1,Q_2,\ldots,Q_n\}$ according to the sub-search execution method to obtain the results of all sub-searches;

step 4-2, performing a hash join on the results of the $n$ sub-searches, storing the results whose matching degree meets the threshold $\gamma$ into $C$, and sorting them by matching degree;

step 4-3, returning the search result set $C$ to finish the search.
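The divide, execute and join procedure of steps 1 to 4 can be sketched as follows. This is a simplified stand-in, not the VF2-based matcher named in the text: query nodes beginning with `?` are treated as variables, each sub-search is a two-level tree rooted at a head node, edges to constant nodes contribute to a matching degree, and results are combined with a hash join on shared variables. The triples, relation names and the definition of matching degree (matched constant edges divided by all constant edges) are assumptions made for illustration.

```python
from collections import defaultdict

def is_var(term):
    return term.startswith("?")

def build_index(kg):
    """Index triples as head -> relation -> set of tails."""
    index = defaultdict(lambda: defaultdict(set))
    for head, rel, tail in kg:
        index[head][rel].add(tail)
    return index

def decompose(query):
    """Step 2: group query edges by root node, giving two-level trees."""
    stars = defaultdict(list)
    for head, rel, tail in query:
        stars[head].append((rel, tail))
    return dict(stars)

def run_sub_search(index, root, edges):
    """Step 3: match one two-level tree; returns (binding, matched constant edges)."""
    results = []
    for node in (sorted(index) if is_var(root) else [root]):
        partial = [({root: node} if is_var(root) else {}, 0)]
        for rel, tail in edges:
            tails = index[node][rel]
            if is_var(tail):     # variable child: branch over every possible tail
                partial = [({**b, tail: t}, s) for b, s in partial for t in sorted(tails)]
            else:                # constant child: count it if the edge exists
                partial = [(b, s + (tail in tails)) for b, s in partial]
        results.extend(partial)
    return results

def hash_join(left, right):
    """Step 4-2: join two result lists on their shared variables."""
    if not left or not right:
        return []
    shared = sorted(set(left[0][0]) & set(right[0][0]))
    table = defaultdict(list)
    for b, s in right:
        table[tuple(b[v] for v in shared)].append((b, s))
    return [({**lb, **rb}, ls + rs)
            for lb, ls in left
            for rb, rs in table.get(tuple(lb[v] for v in shared), [])]

def retrieve(kg, query, gamma):
    index = build_index(kg)
    total = sum(1 for _, _, t in query if not is_var(t)) or 1
    joined = None
    for root, edges in decompose(query).items():
        sub = run_sub_search(index, root, edges)
        joined = sub if joined is None else hash_join(joined, sub)
    ranked = [(b, s / total) for b, s in (joined or []) if s / total >= gamma]
    return sorted(ranked, key=lambda item: -item[1])   # step 4-2: sort by matching degree

kg = {
    ("scene_1", "has_target", "min_cost"), ("scene_1", "has_state", "grid_connected"),
    ("scene_1", "uses_strategy", "strategy_1"), ("scene_2", "has_target", "min_cost"),
    ("scene_2", "has_state", "islanded"), ("scene_2", "uses_strategy", "strategy_2"),
    ("strategy_1", "has_action", "charge_storage"), ("strategy_2", "has_action", "shed_load"),
}
query = [("?scene", "has_target", "min_cost"), ("?scene", "has_state", "grid_connected"),
         ("?scene", "uses_strategy", "?strategy"), ("?strategy", "has_action", "?action")]
results = retrieve(kg, query, gamma=0.6)
print(results)
```

Lowering `gamma` to 0.5 also returns the partially matching `scene_2`, ranked after the full match, which mirrors the thresholding and sorting described in step 4-2.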

The intelligent retrieval system and method for the micro-grid dispatching strategy based on the knowledge graph have the beneficial effects that:

the method can well utilize a large amount of heterogeneous operating data of multiple sources accumulated by the micro-grid system, including numbers, characters and the like, and provide a scheduling rule base which is comparable with expert knowledge for the formulation of a future periodic scheduling strategy. When the micro-grid dispatching strategy is made, compared with the traditional semantic analysis method, the retrieval method based on the knowledge graph can more accurately identify the operation information of the micro-grid and return a more comprehensive and accurate dispatching strategy retrieval result, so that the dispatching decision efficiency of the micro-grid is improved, and the intelligent level of the operation dispatching of the micro-grid is improved.

Drawings

The invention has the following drawings:

FIG. 1 is a flow chart for constructing a knowledge graph in the micro-grid dispatching field;

FIG. 2 is a schematic diagram of a model for identifying unstructured data BiLSTM-CRF named entities in the field of micro-grid dispatching;

FIG. 3 is a diagram of a BiGRU-Attention relationship identification model architecture in the micro-grid dispatching field;

fig. 4 is an entity and relationship diagram of a micro-grid dispatching domain knowledge graph stored in a Neo4j graph database;

FIG. 5 is a schematic diagram of a micro-grid scheduling policy retrieval system architecture;

FIG. 6 is the search graph $Q_A$ of micro-grid A at the current time;

FIG. 7 shows the sub-search graphs $Q_{A1}$, $Q_{A2}$, $Q_{A3}$ and $Q_{A4}$ of micro-grid A;

FIG. 8 is a sub-search flow chart;

fig. 9 is a sub-search result connection.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings.

1. Deep learning-based knowledge graph construction in micro-grid dispatching field

The construction of the knowledge graph in the micro-grid dispatching field is divided into a mode layer construction and a data layer construction, wherein the mode layer is a knowledge organization framework of the knowledge graph and is a data model for describing entities in the field, relationships among the entities and attributes. The invention refines meaningful concept types, related attributes and relations among concepts in the micro-grid dispatching field, thereby forming a field knowledge system. As shown in fig. 1, the mode layer of the real-time micro-grid dispatching knowledge graph is composed of 4 core elements of dispatching targets, dispatching strategies, sub-network states (mainly referring to a state space formed by the conditions of internal power consumption, operation time period and the like of the micro-grid), operation states (mainly referring to the states of grid connection or island and the like of the micro-grid) and correlations among the 4 core elements; the data layer construction mainly comprises knowledge extraction, knowledge fusion, knowledge reasoning and knowledge storage. The data sources are mainly composed of structured data from the scheduling system, semi-structured and unstructured data from scheduling rule reports and related literature materials.

(1) BiLSTM-CRF-based entity identification in micro-grid scheduling field

One basic task of knowledge extraction is named entity recognition, which is mainly solved by a sequence labeling method. The invention defines 7 entity categories of Load condition (Load), running State (State), sub-network State (Space), scheduling policy (structure), power supply condition (Power), regulation Target (Target) and energy Storage State (Storage) together. By adopting a BIO labeling method, the beginning of an entity is labeled by B, the rest is labeled by I, O represents a non-entity, and the following table is a labeling example.

Table 1 BIO entity labeling instance in micro grid scheduling field

For structured data obtained in real time from the grid dispatching system, rule-based extraction is used to directly generate entity-relation-entity triples and store them in the knowledge graph. The structured data includes micro-grid load data, wind/photovoltaic renewable generation power, energy storage state of charge and the like. The specific flow is as follows: 1) connect to the database and perform initialization; 2) construct an SQL statement and query the data; 3) convert data types, structures and attributes; 4) judge whether the data already exists in the Neo4j database; if so, return to step 2, otherwise proceed to step 5 to store the data; 5) construct a Neo4j data storage statement, determine the hierarchical (superordinate-subordinate) relationship from the information extracted by the SQL statement, and create a node; 6) judge whether all SQL statements have been queried; if so, exit the extraction flow, otherwise return to step 2 and continue constructing SQL statements to query data.

For semi-structured and unstructured text such as dispatching management regulations, dispatching procedures and dispatchers' operational experience rules, the invention provides a named entity recognition model for the micro-grid scheduling field based on a bidirectional long short-term memory network with a conditional random field (Bidirectional Long Short Term Memory-Conditional Random Field, BiLSTM-CRF). The model architecture has three layers, namely an input layer, a hidden layer (BiLSTM layer) and a labeling layer (CRF layer), as shown in FIG. 2. The unstructured text data includes the micro-grid operation mode, the micro-grid scheduling strategy, the micro-grid regulation target, the scheduling principle and the like.

1) The input layer is mainly responsible for vectorizing the words in the window. The text in data such as micro-grid scheduling strategy research literature is segmented with jieba, and word2vec from the gensim toolkit is trained to obtain a word vector matrix. Through the input layer, the text to be recognized is mapped with the trained word vector matrix to form a sequence of word embeddings $x=\{x_1,x_2,\ldots,x_n\}$, where $x_t$ denotes the input vector of the $t$-th word. The word vector sequence is used as the initial input of the BiLSTM network.

2) BiLSTM consists of two long short-term memory (Long Short Term Memory, LSTM) networks, one forward and one backward; both receive the same input but pass information in opposite directions. In the invention, the input vector sequence $\{x_1,x_2,\ldots,x_n\}$ formed from the relevant text in data such as micro-grid scheduling strategy research literature is used as the input of the forward LSTM layer, and the reversed sequence $\{x_n,x_{n-1},\ldots,x_1\}$ is used as the input of the backward LSTM layer. The forward and backward LSTMs process the sequence in time order and reverse time order respectively, and the model receives the word vector of one word at each time step, so the input at the $t$-th time step is the word vector $x_t$ of the $t$-th word of the sentence. The hidden states are defined as $h_R=\{h_{R1},h_{R2},\ldots,h_{Rn}\}$ (forward LSTM) and $h_L=\{h_{L1},h_{L2},\ldots,h_{Ln}\}$ (backward LSTM). At time $t$, the model concatenates the hidden state $h_R$ output by the forward LSTM and the hidden state $h_L$ output by the backward LSTM position by position, $h_t=[h_R;h_L]\in\mathbb{R}^{m}$, to obtain the complete hidden state sequence $(h_1,h_2,\ldots,h_n)\in\mathbb{R}^{n\times m}$, where $m$ is the hidden state vector dimension. The formulas are as follows:

$h_R=f(Wx_t+Uh_{t-1}+b)$ (1);

$h_L=f(Wx_t+Uh_{t+1}+b)$ (2);

$h_t=[h_R;h_L]$ (3);

where $[\,\cdot\,;\,\cdot\,]$ denotes vector concatenation, $h_t$ is the complete hidden state concatenated at time $t$ and represents the external state at the current time, $h_{t-1}$ is the external state at the previous time, $h_{t+1}$ is the external state at the next time, and $b$ is the bias.
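The bidirectional recurrence described above can be sketched in plain Python as follows. Note the assumptions: the code mirrors the simplified recurrences (1) and (2) with $f=\tanh$ rather than a full gated LSTM cell, the forward and backward passes use separate illustrative weights, and all numeric values are made up.

```python
import math

def cell(W, U, b, x, h_prev):
    """One recurrent step h = f(W x + U h_prev + b) with f = tanh."""
    return [
        math.tanh(
            sum(W[r][c] * x[c] for c in range(len(x)))
            + sum(U[r][c] * h_prev[c] for c in range(len(h_prev)))
            + b[r]
        )
        for r in range(len(b))
    ]

def bidirectional(xs, fwd, bwd):
    """Run forward and backward passes and concatenate the states per position."""
    h, forward = [0.0] * len(fwd[2]), []
    for x in xs:                          # time order: uses h_{t-1}
        h = cell(*fwd, x, h)
        forward.append(h)
    h, backward = [0.0] * len(bwd[2]), []
    for x in reversed(xs):                # reverse time order: uses h_{t+1}
        h = cell(*bwd, x, h)
        backward.append(h)
    backward.reverse()                    # realign backward states with positions 1..n
    return [f + b for f, b in zip(forward, backward)]   # h_t = [h_R; h_L]

# Toy example: 3 word vectors of size 2, hidden size 2 per direction (m = 4).
xs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fwd = ([[0.5, -0.2], [0.1, 0.3]], [[0.2, 0.0], [0.0, 0.2]], [0.0, 0.1])
bwd = ([[-0.3, 0.4], [0.2, 0.2]], [[0.1, 0.1], [-0.1, 0.3]], [0.05, 0.0])
states = bidirectional(xs, fwd, bwd)
print(states)
```

Each returned state has dimension $m$ equal to the sum of the two hidden sizes, matching the concatenation $h_t=[h_R;h_L]$.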

The complete hidden state sequence is then passed to the output layer of the BiLSTM model, where a matrix transformation maps each $m$-dimensional vector to a $k$-dimensional vector whose dimension corresponds one-to-one with the size of the label set; $k$ is the number of labels in the label set. The result is the $n\times k$ feature matrix $P=(p_1,p_2,\ldots,p_n)\in\mathbb{R}^{n\times k}$, where each component $P_{ij}$ of $p_i\in\mathbb{R}^{k}$ is the score for assigning the word $x_i$ to the $j$-th label. The invention defines 15 labels in total, namely 14 labels corresponding to the 7 entity categories such as Load condition (Load), running State (State) and power supply condition (Power), plus the non-entity label O, as shown in Table 2.

Table 2 BIO labeling method corresponding label in micro grid dispatching field
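The size of the label set can be checked with a short sketch: 7 entity categories with a B and an I tag each, plus O, give 15 labels. The category names follow the English names given in the text; the tag strings such as `B-Load` and the example tokens are assumptions, since the table contents are not reproduced here.

```python
# Entity category names as listed in the text; the "B-"/"I-" tag format is assumed.
ENTITY_TYPES = ["Load", "State", "Space", "Structure", "Power", "Target", "Storage"]

def bio_labels(entity_types):
    """Return the BIO label set: O plus a B- and an I- tag per entity category."""
    labels = ["O"]
    for name in entity_types:
        labels += [f"B-{name}", f"I-{name}"]
    return labels

def tag_span(tokens, start, end, entity_type):
    """Tag tokens[start:end] as one entity of the given type; everything else is O."""
    tags = ["O"] * len(tokens)
    tags[start] = f"B-{entity_type}"
    for i in range(start + 1, end):
        tags[i] = f"I-{entity_type}"
    return tags

labels = bio_labels(ENTITY_TYPES)
example = tag_span(["energy", "storage", "is", "charging"], 0, 2, "Storage")
print(len(labels), example)
```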


3) In the named entity recognition task, an entity name is marked by several labels, so there are strong dependencies among the labels. For this reason, a CRF model is connected to the output layer of the BiLSTM to make the classification decision, instead of applying a Softmax function directly. The feature matrix P output by the BiLSTM layer is therefore input to the CRF layer for the next classification labeling step. As can be seen from Fig. 2, when labeling a sequence the CRF model acts on the structure of the whole sentence rather than on independent single positions, so the final labeling score is affected by adjacent states. If a label sequence is written as y = {y_1, y_2, ..., y_n}, then for the input sequence x = {x_1, x_2, ..., x_n} the score of the model predicting the label sequence y is s(x, y), calculated by formula (4). A state transition matrix M is introduced in the CRF layer; each element of M represents the likelihood of changing from label y_i to label y_{i+1}, so that previously labeled information is used when labeling a new position.

Formula (4) combines the state-transition score from matrix M in the CRF model with the score for classifying word x_i into label y_i. The score of the BiLSTM-CRF model is thus composed of two parts: one is determined by the output P_i of the BiLSTM, and the other depends on the state transition matrix M of the CRF. Finally, the Softmax function is used for normalization, giving the probability output by the model, formula (5).
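Formulas (4) and (5) are not legible in this text. A plausible reconstruction, based only on the symbol definitions given here and in claim 3, is shown below. The subscript notation M_{y_i, y_{i+1}} and P_{i, y_i} and the summation limits are assumptions; the original may, for example, include start and end labels in the transition sum.

```latex
s(x, y) = \sum_{i=1}^{n-1} M_{y_i,\, y_{i+1}} + \sum_{i=1}^{n} P_{i,\, y_i} \tag{4}

P(y \mid x) = \frac{\exp\big(s(x, y)\big)}{\sum_{y' \in Y(x)} \exp\big(s(x, y')\big)} \tag{5}
```

Here Y(x) is the set of all possible label sequences for x, so formula (5) is the Softmax of the sequence score over all candidate label sequences.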

The BiLSTM-CRF model extracts global features of the text sequence data through the BiLSTM model, calculates scores by learning labeling rules among labels by the CRF model, and outputs the optimal label sequence.
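How the optimal label sequence can be selected from the feature matrix P and the transition matrix M is sketched below with the Viterbi algorithm, the usual decoding method for a linear-chain CRF. The text does not name the decoding method, so this is an assumption; the function names and the toy scores (3 words, 3 labels ordered O, B, I) are hypothetical.

```python
def sequence_score(P, M, y):
    """Emission scores P[i][y_i] plus transition scores M[y_i][y_{i+1}]."""
    emit = sum(P[i][y[i]] for i in range(len(y)))
    trans = sum(M[y[i]][y[i + 1]] for i in range(len(y) - 1))
    return emit + trans


def viterbi(P, M):
    """Return the highest-scoring label sequence and its score."""
    n, k = len(P), len(P[0])
    best = list(P[0])      # best[j]: best score of a path ending in label j
    back = []              # back[i-1][j]: previous label on that best path
    for i in range(1, n):
        prev, best, ptr = best, [], []
        for j in range(k):
            a = max(range(k), key=lambda p: prev[p] + M[p][j])
            ptr.append(a)
            best.append(prev[a] + M[a][j] + P[i][j])
        back.append(ptr)
    last = max(range(k), key=lambda j: best[j])
    path = [last]
    for ptr in reversed(back):       # follow the back-pointers to the start
        path.append(ptr[path[-1]])
    path.reverse()
    return path, best[last]


# Toy scores: rows of P are words, columns are labels 0=O, 1=B, 2=I.
P = [[0.1, 2.0, 0.3], [0.2, 0.4, 1.5], [1.8, 0.1, 0.2]]
# M[a][b] scores moving from label a to label b; O -> I is heavily penalized.
M = [[0.5, 0.2, -5.0], [0.0, -1.0, 1.0], [0.3, 0.1, 0.4]]
path, score = viterbi(P, M)
```

The large negative entry for O followed by I shows how the transition matrix lets the CRF layer rule out label sequences that the BIO scheme forbids.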

(2) Micro-grid scheduling relation recognition model based on BiGRU-Attention

Relation extraction (Named Entity Relation Extraction, NRE) determines, on the basis of named entity recognition, whether a predefined relationship exists between entities, thereby forming a series of knowledge triples. Some of the relationship categories among entities defined by the invention are shown in Table 3.

Table 3 Partial relationship categories in the micro-grid scheduling domain

On the basis of the BiLSTM model, the invention uses a bidirectional gated recurrent unit network (Bidirectional Gated Recurrent Unit Network, BiGRU), which has fewer parameters, to speed up model training. It also introduces an Attention mechanism that finds the words playing an important role in relation classification and learns a weight for each; giving these words higher weights raises their importance and thereby improves the accuracy of relation extraction. The structure of the BiGRU-Attention model is shown in Fig. 3 and comprises:

1) Input layer: sentences containing entity labeling information are mapped using a trained vector matrix to obtain a vector sequence x = {x_1, x_2, ..., x_n}, where x_i represents the input vector of the i-th word. The vector sequence is the initial input of the BiGRU network.

2) BiGRU layer: the BiGRU layer consists of GRUs in two directions. The forward GRU models semantic information from left to right, the backward GRU models it from right to left, and the two are spliced to obtain the feature encoding extracted for each word at the current time, h_t = [h_R; h_L]. The calculation formula is the same as formula (3) and the principle is the same as that of the BiLSTM in the first part of this section, so it is not repeated here.

3) Attention layer: the Attention layer assigns a different weight to each word to reflect the different influence of each word on relation classification, and finally concatenates the resulting attention matrices. The output representation of the attention layer is calculated by formula (6).

where N is the sentence length, h_t is the external state at the current time, and a_t is the weight at time t. The weight a_t is calculated from the coding vector at the current time and the coding vector at the latest time, as follows:

h′_t = tanh(W^t h_t + b^t) (8)

where W^t is a weight matrix and b^t is a bias term.
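The attention layer can be illustrated with the sketch below. Formulas (6) and (7) are not legible in this text, so the code assumes a common form: each transformed state h′_t from formula (8) is scored against a context vector u, the scores are normalized with Softmax to give the weights a_t, and the output is the weighted sum of the hidden states h_t. The context vector `u`, the function name, and the toy values are hypothetical.

```python
import math


def attention_pool(H, W, b, u):
    """Weight each hidden state h_t and return (weights, weighted sum).

    H: list of N hidden states, each a list of d floats.
    W: d x d weight matrix and b: bias of length d, as in formula (8).
    u: assumed context vector of length d used to score each h'_t.
    """
    d = len(b)
    scores = []
    for h in H:
        # h'_t = tanh(W h_t + b)
        h_prime = [math.tanh(sum(W[r][c] * h[c] for c in range(d)) + b[r])
                   for r in range(d)]
        scores.append(sum(u[r] * h_prime[r] for r in range(d)))
    top = max(scores)                       # subtract the max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]     # a_t, summing to 1
    pooled = [sum(weights[t] * H[t][c] for t in range(len(H)))
              for c in range(d)]
    return weights, pooled


# Toy input: three 2-dimensional hidden states, identity W, zero bias.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, pooled = attention_pool(H, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [1.0, 1.0])
```

In this toy input the third state scores highest against `u`, so it receives the largest weight and dominates the pooled vector.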

4) Output layer: this layer comprises a fully connected layer and a Softmax layer. The output C_t of the attention layer is input to the fully connected layer, whose output vector has a dimension equal to the number of relations, the i-th dimension being the predicted value of the i-th relation. The vector learned by the fully connected layer is input to the Softmax layer to predict the probability of each relationship category, and ŷ is defined as the finally solved relationship prediction result.

p(y|x) = softmax(W^s C_t + b^s) (9)

ŷ = arg max_y p(y|x) (10)

where y is a relationship type, p(y|x) is its predicted probability, W^s is the weight parameter learned by the classifier, and b^s is the bias term of the classifier.
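The output layer amounts to a linear map followed by Softmax and an arg max, which the sketch below spells out. The relation names, weights, and the function name `predict_relation` are hypothetical; in the model, W^s and b^s would be learned during training.

```python
import math


def softmax(z):
    """Normalize a list of scores into probabilities that sum to 1."""
    top = max(z)                          # subtract the max for stability
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict_relation(C, W_s, b_s, relation_names):
    """Apply the fully connected layer and Softmax, then pick the arg max."""
    logits = [sum(w * c for w, c in zip(row, C)) + bias
              for row, bias in zip(W_s, b_s)]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return relation_names[best], probs


# Hypothetical attention output and classifier parameters for 3 relations.
names = ["greater than", "belongs to", "corresponds to"]
C = [1.0, 0.0]
W_s = [[0.2, 0.1], [1.5, -0.3], [-0.4, 0.9]]
b_s = [0.0, 0.0, 0.0]
relation, probs = predict_relation(C, W_s, b_s, names)
```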

(3) Knowledge fusion

Because knowledge in the micro-grid dispatching field comes from a wide range of sources, the information obtained by knowledge extraction must undergo entity disambiguation and coreference resolution through knowledge fusion. Entity disambiguation distinguishes entities that may have multiple meanings; coreference resolution merges nouns and pronouns that have the same meaning and referent in the knowledge graph. Because the power industry has well-defined terminology standards, entity ambiguity is essentially absent, and the invention focuses on the coreference problem. The main task is to find synonyms among all words representing entities/attributes. First, the regular expression ([\u4e00-\u9fa5])(,)?(or)?(and)?(sum)?[\u4e00-\u9fa5]+ is used to find sentences with an omitted noun and to locate the end of the entity, and the boundary of the omitted data entity is then determined from the named entity recognition result so that the entity can be completed. Second, all words representing entities/attributes are divided into several sets by part of speech (verbs, nouns, adjectives, and so on), and synonym recognition is carried out on each set separately. Third, to describe the semantic similarity between words representing entities/attributes, the defect record corpus is trained with the word2vec method using a word vector dimension of 100, giving a word vector for every word in the corpus; similarity is judged by the cosine similarity between word vectors. Fourth, word pairs with high cosine similarity that appear in the same defect record are deleted, which removes adjacent word pairs and screens out coordinate word pairs. Finally, word pairs that share a word are merged into one synonym set, forming several synonym sets; one word in each set is selected as the standardized name of all words in that set, and the synonym sets are represented in the form of a synonym table.
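The synonym-table step can be sketched as follows, assuming word vectors are already available. Training them with word2vec is not shown, and the 2-dimensional toy vectors stand in for the 100-dimensional vectors of the text. The similarity threshold of 0.9, the rule of taking the alphabetically first word as the standardized name, the function names, and the example words are all assumptions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def synonym_table(vectors, records, threshold=0.9):
    """Map every word to the standardized name of its synonym set.

    vectors: dict word -> vector; records: list of sets of words, one per record.
    """
    parent = {w: w for w in vectors}      # union-find forest over the words

    def find(w):
        while parent[w] != w:
            w = parent[w]
        return w

    for a, b in combinations(sorted(vectors), 2):
        if cosine(vectors[a], vectors[b]) < threshold:
            continue
        # Similar words that co-occur in one record are treated as coordinate
        # terms rather than synonyms, so the pair is dropped.
        if any(a in rec and b in rec for rec in records):
            continue
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)   # smaller word becomes the name
    return {w: find(w) for w in vectors}


# Hypothetical vectors and records.
vectors = {
    "battery": [1.0, 0.1],
    "energy storage": [0.9, 0.12],
    "photovoltaic": [0.1, 1.0],
    "wind turbine": [0.12, 0.95],
}
records = [{"photovoltaic", "wind turbine"}, {"battery"}, {"energy storage"}]
table = synonym_table(vectors, records)
```

In this toy data "photovoltaic" and "wind turbine" have similar vectors but appear in the same record, so they are kept apart, while "battery" and "energy storage" are merged under one standardized name.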

(4) Knowledge storage

At present a knowledge graph can be stored in several kinds of databases, including native databases, relational databases, and non-relational databases. When new entities or relations appear in the knowledge graph, a graph database avoids work such as adding new tables or fields, which greatly improves working efficiency and retrieval performance; the Neo4j graph database is therefore selected for knowledge graph storage.

The micro-grid dispatching domain knowledge graph constructed by the invention has 157 entities and 499 relations in the Neo4j graph database; their graphical display in Neo4j is shown in Fig. 4.
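One way to write triples into Neo4j is to generate a Cypher MERGE statement per triple, as sketched below. The node label `Entity`, the property `name`, the function name, and the example triple are hypothetical; the text does not describe the actual schema or import procedure.

```python
import re


def triple_to_cypher(head, relation, tail):
    """Return (query, params) that upserts one triple.

    Relationship types cannot be passed as Cypher parameters, so the
    relation name is sanitized and embedded in the query text.
    """
    rel_type = re.sub(r"\W+", "_", relation.strip()).upper()
    if not rel_type or rel_type[0].isdigit():
        raise ValueError(f"unusable relation name: {relation!r}")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
    return query, {"head": head, "tail": tail}


# Hypothetical triple linking a micro-grid to a state node.
query, params = triple_to_cypher("micro-grid A", "has state", "state 52")
```

Using MERGE rather than CREATE keeps repeated imports from duplicating entities or relations. The resulting query and parameters could then be passed to a Neo4j client, for example the `session.run(query, params)` call of the official Python driver.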

2. Micro-grid dispatching strategy retrieval system based on knowledge graph

(1) Micro-grid scheduling strategy retrieval system architecture based on knowledge graph

Based on the ontology architecture of the knowledge graph in the micro-grid dispatching field, the BiLSTM-CRF model is used to identify micro-grid dispatching entities and extract their attributes, and the BiGRU-Attention model is used to extract and classify relations according to micro-grid dispatching characteristics; after entity linking and knowledge completion, the knowledge graph of the micro-grid dispatching field is constructed. As shown in Fig. 5, the micro-grid dispatching strategy retrieval system architecture uses Neo4j to store the micro-grid dispatching domain knowledge graph and its related data, and uses real-time data from the dispatching system to update information such as entities, relations, and attribute values.

When the real-time operation data of each micro-grid is uploaded to the retrieval system, the corresponding scheduling strategy is given by three modules: information analysis, micro-grid state judgment, and scheduling strategy retrieval. During handling, the system prompts the dispatcher with the screened main information, implicit knowledge, operating principles, special requirements, and other content. After the scheduling handling process ends, the structured knowledge of the scheduling event is automatically extracted and imported into a case knowledge base for subsequent case recording, review, and reasoning.

The information analysis module performs calculation and logical judgment on the operation data of each micro-grid and matches the results to entities and relations in the micro-grid dispatching domain knowledge graph. The micro-grid state judgment module then judges and confirms the state of the micro-grid according to the entity matching results of the previous module, such as the regulation target and the energy storage condition, and generates a search graph. Finally, the scheduling strategy retrieval module uses a knowledge computing engine to search for the matching knowledge path in the micro-grid dispatching domain knowledge graph and obtain the final result.

(2) Micro-grid scheduling strategy retrieval method based on sub-graph matching

The invention provides a micro-grid scheduling strategy retrieval algorithm based on subgraph matching. The current state is calculated and judged from the operation data of each sub-micro-grid and used as the keyword of the scheduling strategy to be retrieved. By analyzing this key information with knowledge graph technology and a subgraph matching algorithm, the scheduling strategy of the micro-grid under the current condition is obtained, which improves scheduling efficiency and accuracy and provides intelligent information services for complex micro-grid scheduling. The retrieval algorithm is divided into 4 steps:

Step 1: Generate a search graph Q according to the state corresponding to the current operation data of the micro-grid.

Step 2: Divide the search graph Q into a plurality of sub-search graphs.

Step 3: Execute the sub-searches from step 2 in the knowledge graph to obtain the results of all sub-searches.

Step 4: Connect the sub-search results obtained in step 3 to generate the matching subgraph, which is the final search result.

Here the search graph Q = (E_Q, R_Q) contains a point set E_Q and an edge set R_Q. Each search point corresponds to a specific entity description, and each edge represents the relationship between the two points (entities) it connects. In the knowledge graph G = (E_G, R_G), the subgraph satisfying the mapping function F is defined as the matching subgraph φ(Q). The mapping function F refers to the mutual correspondence of elements between the knowledge graph and the subgraph; that is, φ(Q) maps the points E_Q in Q to points φ(E_G) in G and the edges R_Q in Q to edges φ(R_G) in G.
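The matching condition can be made concrete with a small check, sketched below. Edges are represented as (source, relation, target) triples, and the mapping is required to be one-to-one, as in subgraph isomorphism; both choices are assumptions, as are the function name and the example node names.

```python
def is_match(q_edges, g_edges, mapping):
    """True if `mapping` sends every edge of Q onto an edge of G.

    q_edges, g_edges: sets of (source, relation, target) triples.
    mapping: dict from points of Q to points of G.
    """
    if len(set(mapping.values())) != len(mapping):
        return False          # two search points mapped to the same entity
    return all(
        u in mapping and v in mapping
        and (mapping[u], rel, mapping[v]) in g_edges
        for u, rel, v in q_edges
    )


# Hypothetical one-edge search graph and knowledge graph.
q_edges = {("A", "has state", "S")}
g_edges = {("micro-grid 1", "has state", "state 52")}
ok = is_match(q_edges, g_edges, {"A": "micro-grid 1", "S": "state 52"})
```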

For example, suppose that in the grid-connected operation state the current group power generation is larger than the group power consumption, where micro-grid A has a power generation of 175 kW·h, a load power of 200 kW·h, and an SOC of 0.65, which is greater than SOC_min (0.2). The information analysis module then forms the search graph shown in Fig. 6.
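How operating data might be turned into the points and edges of a search graph is sketched below using the numbers from this example. The relation names, entity descriptions, and the function name are hypothetical and are not taken from Fig. 6; only the values 175, 200, 0.65, and SOC_min = 0.2 come from the text.

```python
def build_search_graph(name, generation, load, soc, soc_min=0.2,
                       grid_connected=True):
    """Return (nodes, edges) describing one micro-grid's current state.

    Edges are (source, relation, target) triples rooted at the micro-grid.
    """
    nodes = {name}
    edges = set()

    def add(relation, entity):
        nodes.add(entity)
        edges.add((name, relation, entity))

    add("operation mode", "grid-connected" if grid_connected else "islanded")
    add("power balance",
        "generation < load" if generation < load else "generation >= load")
    add("energy storage",
        "SOC > SOC_min" if soc > soc_min else "SOC <= SOC_min")
    return nodes, edges


# Values of micro-grid A from the example above.
nodes, edges = build_search_graph("micro-grid A", 175, 200, 0.65)
```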

1) Sub-search graph partitioning

The search graph is divided into several sub-search graphs so that each sub-search graph has few vertices and a single edge feature, which reduces the search difficulty. Each sub-search graph is a two-layer tree structure comprising a root node, one layer of child nodes, and the edges between them. As shown in Fig. 7, the search graph Q_A is divided by this rule into sub-search graphs Q_A1, Q_A2, Q_A3, and Q_A4.

By matching nodes and edges, the search results of the sub-search graphs Q_A1, Q_A2, Q_A3, and Q_A4 are obtained, and from them the search result of Q_A.
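A simple way to obtain two-layer trees is to group the edges of the search graph by their source node, so that each group is a root with one layer of children. The sketch below assumes directed (source, relation, target) edges and this grouping rule; the exact partition rule behind Fig. 7 is not spelled out in the text, and the function and node names are hypothetical.

```python
from collections import defaultdict


def split_into_stars(edges):
    """Group edges by source node: each group is a root plus its children."""
    stars = defaultdict(list)
    for source, relation, target in edges:
        stars[source].append((relation, target))
    return dict(stars)


# Hypothetical search graph: A points to state S, which has two children.
edges = [("A", "r1", "S"), ("S", "r2", "X"), ("S", "r3", "Y")]
stars = split_into_stars(edges)
```

Node S appears both as a child of A and as a root of its own sub-search graph; such shared vertices are what later allow the sub-search results to be connected.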

2) Sub-search execution

First, the sub-search graph is decomposed into a minimum spanning tree, and the edges of the minimum spanning tree are matched first during edge matching. Next, the root node is selected as the vertex to match first because of its strong filtering capability. The sub-search is then executed on the basis of the conventional VF2 graph matching algorithm, combined with the label features of the graph. The flow chart is shown in Fig. 8.
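The idea of matching the root first and then extending along labeled edges can be sketched for a single two-layer sub-search graph as below. This is a simplified label-filtered matcher written for illustration, not the VF2 algorithm itself; the data layout, function name, and example graph are hypothetical.

```python
def match_star(root_label, children, g_labels, g_edges):
    """Match one two-layer sub-search graph against the knowledge graph.

    root_label: label the root must carry.
    children: list of (relation, child_label) pairs hanging off the root.
    g_labels: dict node -> label in the knowledge graph.
    g_edges: set of (source, relation, target) triples in the knowledge graph.
    Returns a list of dicts mapping "root" and each child index to graph nodes.
    """
    results = []
    # Match the root first: its label filters out most candidate vertices.
    for root, label in g_labels.items():
        if label != root_label:
            continue
        partial = [{"root": root}]
        for idx, (relation, child_label) in enumerate(children):
            extended = []
            for m in partial:
                for s, r, t in g_edges:
                    if (s == root and r == relation
                            and g_labels.get(t) == child_label
                            and t not in m.values()):
                        extended.append({**m, idx: t})
            partial = extended       # empty if this child cannot be matched
        results.extend(partial)
    return results


# Hypothetical knowledge graph fragment with two State nodes.
g_labels = {"s52": "State", "s9": "State", "p1": "Power", "l1": "Load"}
g_edges = {("s52", "has power", "p1"), ("s52", "has load", "l1"),
           ("s9", "has power", "p1")}
matches = match_star("State", [("has power", "Power"), ("has load", "Load")],
                     g_labels, g_edges)
```

Only "s52" survives, because "s9" has no outgoing "has load" edge.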

3) Connection of sub-search results

Finally, all sub-search results are linked together to generate the matching subgraph. Taking the scheduling strategy of the search example above, the sub-searches Q_A2, Q_A3, and Q_A4 are first executed separately to obtain their three results, and all sub-search results are then connected. The results are connected by hash join: the search results of Q_i and Q_j can be connected if and only if the two sub-searches have a common vertex. Q_A2, Q_A3, and Q_A4 share the common node "state 52", and the connection of their search results is shown in Fig. 9 (top). Finally, the sub-search Q_A1 is executed, and after connection the search result of Q_A is obtained, as shown in Fig. 9 (bottom).

The basic process of sub-search result connection is as follows.

Step 1: Initialize the sub-search result set C, divide the search into sub-searches Q_i ∈ (Q_1, Q_2, ..., Q_n), and execute every Q_i according to the sub-search execution method to obtain the results of all sub-searches.

Step 2: Hash-join the search results of the n sub-searches, store in C the results whose matching degree meets the threshold γ, and sort them by matching degree.

Step 3: Return the search result set C to complete the search.
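The hash join of two sub-search result lists can be sketched as follows. Each result is represented as a dict from search-graph vertices to knowledge-graph nodes, and all results in one list are assumed to cover the same vertices. The matching degree and the threshold γ are not modeled because the text does not define how they are computed; the function name and example values are hypothetical.

```python
from collections import defaultdict


def hash_join(left, right):
    """Join two lists of partial matches on their common search vertices."""
    if not left or not right:
        return []
    shared = sorted(set(left[0]) & set(right[0]))
    if not shared:
        return []             # no common vertex: results cannot be connected
    buckets = defaultdict(list)
    for m in left:            # build phase: hash the left results by key
        buckets[tuple(m[v] for v in shared)].append(m)
    joined = []
    for m in right:           # probe phase: look up each right result
        for other in buckets.get(tuple(m[v] for v in shared), []):
            joined.append({**other, **m})
    return joined


# Hypothetical results of two sub-searches sharing the vertex "state".
left = [{"state": "state 52", "power": "p1"}, {"state": "state 9", "power": "p1"}]
right = [{"state": "state 52", "load": "l1"}]
joined = hash_join(left, right)
```

Joining more than two sub-searches is a matter of folding this function over the list of results, which matches the order described above: Q_A2, Q_A3, and Q_A4 first, then Q_A1.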

Examples:

The numbers, tables, and text in "Wind/light/storage micro-grid control strategy for peak clipping and valley filling scheduling of a main grid" (Chinese electric power, 2013, 46(2), 87-91) are taken as the experimental objects. After knowledge graph entity recognition and relation extraction, they are converted into unified, standardized "entity/attribute-relation-entity/attribute" triples. The micro-grid scheduling domain knowledge graph finally constructed according to the system and method of the invention contains 157 entities/attributes and 499 relations.

LSI (Latent Semantic Indexing), LDA (Latent Dirichlet Allocation, a three-layer Bayesian probability model), and the knowledge graph model of the invention were each used to retrieve matches for 1000 scheduling records in the scheduling manual. The retrieval accuracy of the knowledge graph model is 91.54%, a clear advantage over the LSI model (35.87%) and the LDA model (40.27%).

What is not described in detail in this specification is prior art known to those skilled in the art.

Claims

1. The micro-grid dispatching strategy intelligent retrieval system based on the knowledge graph is characterized by comprising an information analysis module, a micro-grid state judgment module and a micro-grid dispatching field knowledge graph and dispatching strategy retrieval module;

2. The knowledge-graph-based intelligent retrieval system for a micro-grid scheduling strategy according to claim 1, wherein: the steps of storing the parsed structured data into the knowledge graph in the micro-grid dispatching field are as follows:

step 1, connecting a database for initializing operation;

step 2, constructing SQL sentences and carrying out data query;

step 3, converting data types, structures and attributes;

3. The knowledge-graph-based intelligent retrieval system for a micro-grid scheduling strategy according to claim 1, wherein: the data entity uses a BiLSTM-CRF model to carry out identification and attribute extraction, and the specific method comprises the following steps:

step 1, word segmentation is carried out on a micro-grid dispatching strategy text by using jieba, word2vec of the gensim toolkit is used for training to obtain a word vector matrix, the word vector matrix is used for mapping the text to be recognized, and a word vector sequence x = {x_1, x_2, ..., x_n} is formed, where x_t represents the input vector of the t-th word, t = 1, 2, 3, ..., n;

h_R = f(W x_t + U h_{t-1} + b) (1);

h_L = f(W x_t + U h_{t+1} + b) (2);

where f() is a nonlinear activation function, W = (w_1, w_2, ..., w_n)^T is the state-input weight matrix, U = (u_1, u_2, ..., u_n)^T is the state-state weight matrix, x_t is the input at the current time,

step 3, transmitting the complete vector sequence to an output layer and mapping the m-dimensional vectors to k-dimensional vectors, where k is the number of labels in the labeling set, to output an n × k feature matrix P = (p_1, p_2, ..., p_n) ∈ R^{n×k}, where p_i ∈ R^k and each entry P_ij is the score for classifying word x_i into the j-th label;

in the above formula, s(x, y) is the score with which the CRF layer predicts the label sequence y = {y_1, y_2, ..., y_n} for the input sequence x = {x_1, x_2, ..., x_n};

the two quantities in the formula are the state transition matrix score value in the CRF model, where each element of the state transition matrix M represents the likelihood of changing from y_i to y_{i+1}, and the score value for classifying word x_i into label y_i;

in the above formula, P(y|x) is the probability that the input sequence x is classified into the label sequence y; y′ represents one possible label sequence, y′ ∈ Y(x), where Y(x) represents all possible label sequences; Σ_{y′∈Y(x)} exp(s(x, y′)) is the sum over the exponentiated scores of all label sequences; and the y with the largest probability value is output as the final label sequence.

4. The knowledge-graph-based intelligent retrieval system for a micro-grid scheduling strategy as claimed in claim 3, wherein: the data entity relation uses a BiGRU-Attention model to extract and classify the data entity relation, and the specific method comprises the following steps:

step 1, word segmentation is carried out on a micro-grid dispatching strategy text by using jieba, word2vec of the gensim toolkit is used for training to obtain a word vector matrix, the word vector matrix is used for mapping the text to be recognized, and a word vector sequence x = {x_1, x_2, ..., x_n} is obtained, where x_i represents the input vector of the i-th word, i = 1, 2, 3, ..., n;

step 3, distributing different weights to each word in the attention layer, connecting the obtained attention matrices, and calculating the output representation C_t of the attention layer, as shown in the following formula (6):

p(y|x) = softmax(W^s C_t + b^s) (7);

ŷ = arg max_y p(y|x) (8);

in the above formula: y is the relationship category; p(y|x) is its predicted probability; ŷ is the finally solved relationship prediction result; W^s is the weight parameter learned by the classifier; b^s is the bias term of the classifier.

5. The knowledge-graph-based intelligent retrieval system for a micro-grid scheduling strategy according to claim 4, wherein: the construction method of the knowledge graph in the micro-grid dispatching field comprises the following steps:

([\u4e00-\u9fa5])(,)?(or)?(and)?(sum)?[\u4e00-\u9fa5]+ (9);

6. The intelligent retrieval method of the micro-grid dispatching strategy based on the knowledge graph is characterized by comprising the following steps of:

step 2: dividing the search map Q into a plurality of sub-search maps;

step 4: connecting the sub-search results obtained in the step 3 to generate a matching sub-graph, namely a final search result;

wherein Q = (E_Q, R_Q) contains the set of nodes E_Q and the set of edges R_Q; each node in E_Q corresponds to a specific entity description, and the edges in R_Q represent the relationship between any two nodes;

in the knowledge graph G = (E_G, R_G), the subgraph satisfying the mapping function F is defined as the matching subgraph φ(Q), that is, φ(Q) maps the nodes E_Q in Q to nodes φ(E_G) in G and the edges R_Q in Q to edges φ(R_G) in G; the mapping function F refers to the mutual correspondence of elements between the knowledge graph and the matching subgraph; E_G is the set of nodes in the knowledge graph G and R_G is the set of edges in the knowledge graph G; the nodes in E_G correspond to entities, and the edges in R_G represent the relationship between any two nodes.

7. The knowledge-graph-based intelligent retrieval method for the micro-grid scheduling strategy, as set forth in claim 6, is characterized in that:

8. The knowledge-graph-based intelligent retrieval method for the micro-grid scheduling strategy, as set forth in claim 6, is characterized in that: the step 4 specifically comprises the following steps:

step 4-3, returning the search result set C to finish the search.