CN113642862A - Method and system for identifying named entities of power grid dispatching instructions based on BERT-MBIGRU-CRF model - Google Patents

Info

Publication number
CN113642862A
Authority
CN
China
Prior art keywords
power grid
mbigru
grid dispatching
model
dispatching instruction
Prior art date
Legal status
Pending
Application number
CN202110864643.0A
Other languages
Chinese (zh)
Inventor
杨梓俊
荆江平
孙昕杰
张刘冬
吴海洋
王黎明
杨明
申张亮
邓晨
赵帅
蒋雪冬
Current Assignee
State Grid Jiangsu Electric Power Co Ltd
Original Assignee
State Grid Jiangsu Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by State Grid Jiangsu Electric Power Co Ltd filed Critical State Grid Jiangsu Electric Power Co Ltd
Priority to CN202110864643.0A
Publication of CN113642862A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply


Abstract

The invention discloses a method and a system for identifying named entities in power grid dispatching instructions based on a BERT-MBIGRU-CRF model. The system comprises a sample preprocessing module, a BERT pre-training module, an MBIGRU feature extraction module, a CRF training module and an output recognition result module. The method comprises the following steps: preprocessing power grid dispatching instructions before training; building a BERT-MBIGRU-CRF model; training the model on a training set of historical power grid dispatching instructions to generate a neural network model with power grid characteristics; predicting on prediction samples with the fully trained named entity recognition system; and finally obtaining the named entity recognition result. The invention utilizes a multilayer bidirectional GRU network (MBIGRU) to deeply depict the important characteristics of dispatching instructions and to extract context information bidirectionally, effectively improving the accuracy of named entity recognition for power grid dispatching instructions.

Description

Method and system for identifying named entities of power grid dispatching instructions based on BERT-MBIGRU-CRF model
Technical Field
The invention relates to the field of power grid dispatching instruction identification, in particular to a power grid dispatching instruction named entity identification method and system based on a BERT-MBIGRU-CRF model.
Background
Electric power systems are becoming increasingly intelligent. To work with the intelligent control systems introduced into the power grid, the dispatching instructions issued by dispatching personnel must be recognized intelligently during the daily dispatching of the distribution network, ensuring that the instruction receiving and sending process between dispatching personnel and field personnel is correct, standardized, and compliant with dispatching safety standards. The acceleration of power grid intellectualization is driving artificial intelligence to replace manual work, and the recognition of the dispatching instruction content sent and received during power grid dispatching is therefore very important.
At present, many recognition methods exist in natural language processing, such as hidden Markov models (HMM), conditional random fields (CRF), and long short-term memory networks (LSTM). These named entity recognition methods suffer from defects: they either cannot label context or require manual extraction of data features, and they cannot accurately recognize the data sets. As the pace of power grid intellectualization continues to accelerate, the growth of data volume and the fluctuation and randomness of the data make traditional semantic recognition methods increasingly unable to meet the requirements of practical applications.
Disclosure of Invention
In order to solve the defects in the prior art, the invention aims to provide a method and a system for identifying a named entity of a power grid dispatching instruction based on a BERT-MBIGRU-CRF model.
To achieve the object of the invention, the following technical scheme is adopted:
a method for identifying a named entity of a power grid dispatching instruction based on a BERT-MBIGRU-CRF model comprises the following steps:
(1) preprocessing the power grid dispatching instruction before training to obtain a labeled power grid dispatching instruction named entity recognition training set;
(2) building a BERT-MBIGRU-CRF power grid dispatching instruction identification model and completing training;
(3) and identifying the current power grid dispatching instruction by using the well-trained BERT-MBIGRU-CRF power grid dispatching instruction identification model, and outputting an identification result.
Further, the step (1) specifically includes:
(1.1) eliminating punctuation marks and special characters in the power grid dispatching instruction samples, correcting wrongly written characters, and normalizing the text;
and (1.2) marking each single-sentence power grid dispatching instruction by adopting a named entity marking method.
Further, the step (2) specifically includes:
(2.1) pre-training by using a BERT model to obtain a power grid dispatching instruction vector;
(2.2) extracting features by using the MBIGRU model to generate feature vectors;
and (2.3) marking constraint on the extracted feature vectors by using a CRF model, calculating a loss function, and stopping iteration when a loss value reaches a threshold value to finish training.
Further, in the step (2.1),
the BERT model comprises an embedding layer, a bidirectional Transformer encoder and a pooling layer; and respectively capturing word-level and sentence-level expressions through a masking language model and a next sentence prediction task, and performing combined training.
Further, the step (2.1) specifically includes:
(2.1.1) converting the preprocessed power grid dispatching instruction named entity recognition training set into word vectors, text vectors and position vectors which are used as BERT model input, and entering an embedding layer;
(2.1.2) the embedding layer converts the power grid dispatching instruction into distributed representation vectors; in the embedding process, a [CLS] token marks the beginning of the instruction and an [SEP] token marks its end;
(2.1.3) performing a nonlinear representation of each power grid dispatching instruction with the bidirectional Transformer encoder, using position encoding to memorize the word vector sequence and connect context information bidirectionally;
and (2.1.4) outputting a power grid dispatching instruction vector and an integral sequence representation by the BERT model through a pooling layer.
Further, in the step (2.2), the MBIGRU model includes two gating structures, namely, an update gate and a reset gate, and the reset gate and the update gate together determine the output of the hidden state.
Further, the update gate in the MBIGRU controls how much information from the previous time step is carried into the current time step; the update gate state value z_t is given by:
z_t = f(W_z x_t + U_z h_{t-1})
where f denotes the sigmoid function, x_t represents the input vector at time t, h_{t-1} represents the hidden layer state at time t-1, W_z represents the update gate weight matrix, and U_z represents the update gate bias matrix;
the reset gate in the MBIGRU controls whether the information of the previous time step is discarded or enters the candidate state h̃_t; the reset gate state value r_t is given by:
r_t = f(W_r x_t + U_r h_{t-1})
where W_r represents the reset gate weight matrix and U_r represents the reset gate bias matrix;
the output of the MBIGRU cell is formed from the hidden state of the previous time step and the candidate state of the current time step, in which the previous hidden state is multiplied by r_t; the candidate state h̃_t and output state h_t are given by:
h̃_t = g(W x_t + U (r_t ⊙ h_{t-1}))
h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t
where g represents the activation function, ⊙ denotes element-wise multiplication, and the output state value depends on the update gate state and the reset gate state.
Further, the MBIGRU model stacks forward and reverse GRU models on a single structure; the model structure is represented as follows:
h_t = f(U x_t + W h_{t-1})
h′_t = f(U′ x_t + W′ h′_{t-1})
o_t = g(V h_t + V′ h′_t)
where f and g represent activation functions; U, W and V represent the weights in the forward operation; U′, W′ and V′ represent the weights in the reverse operation; o_t denotes the output value; and h′_t represents the hidden layer state at time t in the reverse direction.
Further, the MBIGRU model stacks the forward and reverse GRU structure multiple times; the model structure is expressed as follows:
h^(1)_t = f(U^(1) x_t + W^(1) h^(1)_{t-1})
h′^(1)_t = f(U′^(1) x_t + W′^(1) h′^(1)_{t-1})
……
h^(n)_t = f(U^(n) o^(n-1)_t + W^(n) h^(n)_{t-1})
h′^(n)_t = f(U′^(n) o^(n-1)_t + W′^(n) h′^(n)_{t-1})
O_t = g(V h^(n)_t + V′ h′^(n)_t)
where O_t represents the output value of the multi-layer forward and backward result, o^(i)_t denotes the combined output of layer i, and h′^(i)_t represents the hidden layer state at time t of the i-th layer in the reverse direction.
A power grid dispatching instruction named entity recognition system based on a BERT-MBIGRU-CRF model comprises a sample preprocessing module, a BERT pre-training module, an MBIGRU feature extraction module, a CRF training module and an output recognition result module;
the sample preprocessing module is used for preprocessing the power grid dispatching instruction before training to obtain a labeled power grid dispatching instruction named entity recognition training set;
the BERT pre-training module is used for inputting a preprocessed power grid dispatching instruction named entity recognition training set as a BERT model, respectively capturing word-level and sentence-level expressions through a masking language model and a next sentence prediction task, performing combined training and outputting a power grid dispatching instruction vector;
the MBIGRU feature extraction module is used for extracting the features of the power grid dispatching instruction vector which is completely trained by using the MBIGRU model to generate a feature vector;
the CRF training module is used for labeling and constraining the extracted feature vectors by using a CRF model, calculating a loss function, and stopping iteration when a loss value reaches a threshold value to finish training;
and the output recognition result module is used for recognizing the current power grid dispatching instruction by using the well-trained BERT-MBIGRU-CRF model and outputting a recognition result.
The invention has the advantages that compared with the prior art,
compared with most of traditional natural language processing models, the BERT model can acquire more language intrinsic information more accurately, and the accuracy of named entity recognition is improved.
The deep learning model based on the MBIGRU strengthens the dependency on context information through its multilayer bidirectional structure, deeply describes the internal characteristics of the language, and can better extract the characteristics of power grid dispatching instructions.
The named entity recognition method based on BERT-MBIGRU-CRF starts from a power grid dispatching instruction data set, trains a neural network model that satisfies power grid dispatching specifications, and greatly improves the accuracy of dispatching instruction recognition when predicting sentences.
Drawings
FIG. 1 is a flow chart of a method for identifying a named entity of a power grid dispatching instruction based on a BERT-MBIGRU-CRF model according to an embodiment of the invention;
FIG. 2 is a block diagram of a BERT model according to an embodiment of the present invention;
FIG. 3 is a block diagram of a BERT-MBIGRU-CRF model according to an embodiment of the present invention;
fig. 4 is a diagram of the MBIGRU model structure according to the embodiment of the present invention.
Detailed Description
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present application is not limited thereby.
The invention relates to a power grid dispatching instruction named entity recognition system based on a BERT-MBIGRU-CRF model, which comprises a sample preprocessing module, a BERT pre-training module, an MBIGRU feature extraction module, a CRF training module and an output recognition result module.
And the sample preprocessing module is used for preprocessing the large-scale original power grid dispatching instruction before training to obtain the marked power grid dispatching instruction named entity recognition training set.
And the BERT pre-training module is used for converting the preprocessed power grid dispatching instruction named entity recognition training set into word vectors, text vectors and position vectors as input of the BERT model, capturing word-level and sentence-level representations respectively through the two tasks of the masked language model and next sentence prediction, and performing joint training. The BERT model outputs the fused vector representation of each word of the dispatching instruction together with a representation of the whole sequence.
And the MBIGRU feature extraction module is used for extracting data features from the fully trained data set by using the MBIGRU neural network to generate feature vectors.
The CRF training module takes the feature vectors output by the neural network as input; the CRF layer applies labeling constraints to the word vectors after neural network feature extraction, increasing reasonable dependency among the word vectors; a loss function is calculated, and iteration stops when the loss value reaches a threshold.
And the output recognition result module is used for recognizing the current scheduling instruction by utilizing the well-trained BERT-MBIGRU-CRF neural network model and outputting a recognition result.
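The flow through these five modules can be sketched as follows. All function names below are illustrative stand-ins, not part of the patented system, and the encoder, feature extractor and decoder are toy placeholders that merely show how the modules chain together.

```python
# Hypothetical sketch of the five-module pipeline described above.
# All names are illustrative; the encoder, feature extractor and decoder
# are toy stand-ins for the BERT, MBIGRU and CRF components.
def preprocess(instruction):
    # sample preprocessing module: drop punctuation and special characters
    return "".join(ch for ch in instruction if ch.isalnum() or ch.isspace())

def run_pipeline(instruction, encoder, feature_extractor, decoder):
    tokens = preprocess(instruction).split()
    vectors = encoder(tokens)               # BERT pre-training module
    features = feature_extractor(vectors)   # MBIGRU feature extraction module
    return decoder(features)                # CRF module -> recognition result

labels = run_pipeline(
    "close breaker 5011!",
    encoder=lambda toks: [[float(len(t))] for t in toks],
    feature_extractor=lambda vs: vs,
    decoder=lambda fs: ["O"] * len(fs),
)
```

In a real deployment each lambda would be replaced by the corresponding trained model component.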
As shown in fig. 1, the method for identifying the named entity of the power grid dispatching instruction based on the BERT-MBIGRU-CRF model of the invention comprises the following steps:
(1) preprocessing a large-scale original power grid dispatching instruction before training to obtain a labeled power grid dispatching instruction named entity recognition training set;
the pretreatment specifically comprises the steps of:
(1.1) eliminating punctuation marks and special characters in the power grid dispatching instruction samples, correcting wrongly written characters and normalizing the text;
and (1.2) labeling with a named entity labeling method, namely the BIOES labeling scheme; since a power grid dispatching instruction is generally a single sentence, each single-sentence instruction can be labeled individually.
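As an illustration of BIOES labeling, the following sketch assigns tags to token spans of a single-sentence instruction; the tokens, the span, and the entity type "DEV" are invented for this example.

```python
# BIOES labeling of token spans in a single-sentence dispatching instruction.
def bioes_tags(tokens, spans):
    """spans: list of (start, end_exclusive, entity_type) token spans."""
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        if end - start == 1:
            tags[start] = "S-" + etype          # single-token entity
        else:
            tags[start] = "B-" + etype          # begin
            for i in range(start + 1, end - 1):
                tags[i] = "I-" + etype          # inside
            tags[end - 1] = "E-" + etype        # end
    return tags

tags = bioes_tags(["close", "breaker", "5011"], [(1, 3, "DEV")])
```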
(2) Building a BERT-MBIGRU-CRF power grid dispatching instruction identification model, and completing training, as shown in FIG. 3;
(2.1) pre-training by using a BERT model to obtain a power grid dispatching instruction vector;
as shown in FIG. 2, the BERT model includes an embedding layer, a bi-directional Transformer encoder, and a pooling layer. Word-level and sentence-level representations are captured and jointly trained by two tasks, a masked language model and a next sense prediction.
The preprocessed power grid dispatching instruction named entity recognition training set is converted into word vectors, text vectors and position vectors as BERT model input, and enters the embedding layer. The embedding layer converts the power grid dispatching instruction into distributed representation vectors; during embedding, a [CLS] token marks the beginning of the dispatching instruction and an [SEP] token marks its end. The bidirectional Transformer encoder performs a nonlinear representation of each dispatching instruction, using position encoding to memorize the word vector sequence and connect context information bidirectionally. After the pooling layer, the BERT model outputs the fused vector representation of each word of the dispatching instruction together with a representation of the whole sequence.
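The input construction described above can be illustrated as follows; the toy vocabulary and token ids are invented for the example and do not reflect BERT's real WordPiece vocabulary.

```python
# Illustrative construction of the three BERT inputs for a single-sentence
# dispatching instruction: token ids (word vector lookup), segment ids
# (text vector) and position ids (position vector), with [CLS]/[SEP] markers.
def build_bert_inputs(tokens, vocab):
    pieces = ["[CLS]"] + tokens + ["[SEP]"]
    token_ids = [vocab[p] for p in pieces]
    segment_ids = [0] * len(pieces)          # one segment: a single instruction
    position_ids = list(range(len(pieces)))  # consumed by the position encoding
    return token_ids, segment_ids, position_ids

vocab = {"[CLS]": 0, "[SEP]": 1, "close": 2, "breaker": 3}
tok_ids, seg_ids, pos_ids = build_bert_inputs(["close", "breaker"], vocab)
```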
(2.2) extracting features by using the MBIGRU model to generate feature vectors;
and inputting the power grid dispatching instruction vector output by the BERT model into the MBIGRU model, and extracting the training set data characteristics with complete training by using the multi-layer bidirectional cyclic neural network MBIGRU to generate the characteristic vector.
(2.3) completing training by using a CRF model;
The feature vectors output by the MBIGRU model are input into the CRF model; the CRF model applies labeling constraints to the feature vectors after neural network feature extraction, increasing reasonable dependency among the feature vectors; a loss function is calculated, and iteration stops when the loss value reaches a threshold.
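The constrained decoding a CRF layer performs at prediction time can be sketched with the standard Viterbi algorithm; the tag set and the emission/transition scores below are toy values, not parameters learned by the patented model.

```python
# Viterbi decoding over toy emission scores (from the feature extractor)
# and tag-transition scores (the CRF's labeling constraints).
def viterbi_decode(emissions, transitions, tags):
    n = len(tags)
    score = list(emissions[0])           # scores for the first token
    backpointers = []
    for em in emissions[1:]:
        new_score, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + em[j])
            ptr.append(best_i)
        score = new_score
        backpointers.append(ptr)
    best = max(range(n), key=lambda j: score[j])
    path = [best]
    for ptr in reversed(backpointers):   # follow backpointers to recover path
        path.append(ptr[path[-1]])
    return [tags[i] for i in reversed(path)]

decoded = viterbi_decode([[1.0, 0.0], [0.0, 2.0]],
                         [[0.0, 0.0], [0.0, 0.0]],
                         ["O", "B-DEV"])
```

During training the CRF instead maximizes the log-likelihood of the gold tag sequence; Viterbi is the inference counterpart.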
(3) And identifying the current power grid dispatching instruction by using the well-trained BERT-MBIGRU-CRF neural network model, and outputting an identification result.
As shown in fig. 4, the MBIGRU neural network model simplifies the gating structure of the LSTM: the model contains only two gating structures, an update gate and a reset gate, which together determine the output of the hidden state.
The update gate in the MBIGRU controls how much information from the previous time step is carried into the current time step; the update gate state value z_t is given by:
z_t = f(W_z x_t + U_z h_{t-1})
where f denotes the sigmoid function, x_t represents the input vector at time t, h_{t-1} represents the hidden layer state at time t-1, W_z represents the update gate weight matrix, and U_z represents the update gate bias matrix.
The reset gate in the MBIGRU controls whether the information of the previous time step is discarded or enters the candidate state h̃_t; the reset gate state value r_t is given by:
r_t = f(W_r x_t + U_r h_{t-1})
where W_r represents the reset gate weight matrix and U_r represents the reset gate bias matrix.
The output of the MBIGRU cell is formed from the hidden state of the previous time step and the candidate state of the current time step, in which the previous hidden state is multiplied by r_t; the candidate state h̃_t and output state h_t are given by:
h̃_t = g(W x_t + U (r_t ⊙ h_{t-1}))
h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t
where g represents the activation function, ⊙ denotes element-wise multiplication, and the output state value depends on the update gate state and the reset gate state.
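A one-dimensional (scalar) version of these gate equations can be written directly; the weights in the example call are arbitrary, and f is assumed to be the sigmoid and g the tanh activation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Scalar GRU cell following the gate equations above: f = sigmoid, g = tanh.
def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, W, U):
    z_t = sigmoid(Wz * x_t + Uz * h_prev)              # update gate
    r_t = sigmoid(Wr * x_t + Ur * h_prev)              # reset gate
    h_cand = math.tanh(W * x_t + U * (r_t * h_prev))   # candidate state
    return (1.0 - z_t) * h_prev + z_t * h_cand         # new hidden state h_t

# with zero gate weights, z_t = r_t = 0.5 and the cell interpolates
# halfway between the previous state (0.0) and tanh(x_t)
h_t = gru_step(1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0)
```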
The MBIGRU model structure stacks forward and reverse GRU models on a single GRU structure; the model structure is represented as follows:
h_t = f(U x_t + W h_{t-1})
h′_t = f(U′ x_t + W′ h′_{t-1})
o_t = g(V h_t + V′ h′_t)
where f and g represent activation functions; U, W and V represent the weights in the forward operation; U′, W′ and V′ represent the weights in the reverse operation; o_t denotes the output value; and h′_t represents the hidden layer state at time t in the reverse direction.
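A minimal sketch of this bidirectional combination, with a toy linear recurrence standing in for the full GRU cell and arbitrary scalar weights:

```python
# Forward pass h_t, backward pass h'_t, merged output o_t = g(V h_t + V' h'_t).
# f and g are the identity here, and all weights are arbitrary scalars.
def run_direction(xs, U, W, f):
    h, states = 0.0, []
    for x in xs:
        h = f(U * x + W * h)   # h_t = f(U x_t + W h_{t-1})
        states.append(h)
    return states

def bigru_outputs(xs, f=lambda v: v, g=lambda v: v,
                  U=1.0, W=0.5, Ub=1.0, Wb=0.5, V=1.0, Vb=1.0):
    fwd = run_direction(xs, U, W, f)                                  # h_t
    bwd = list(reversed(run_direction(list(reversed(xs)), Ub, Wb, f)))  # h'_t
    return [g(V * h + Vb * hp) for h, hp in zip(fwd, bwd)]            # o_t

outputs = bigru_outputs([1.0, 0.0])
```

Each position thus sees left context through the forward states and right context through the backward states, which is what the patent means by extracting context information bidirectionally.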
The MBIGRU model structure stacks the forward and reverse GRU structure multiple times; the model structure is expressed as follows:
h^(1)_t = f(U^(1) x_t + W^(1) h^(1)_{t-1})
h′^(1)_t = f(U′^(1) x_t + W′^(1) h′^(1)_{t-1})
……
h^(n)_t = f(U^(n) o^(n-1)_t + W^(n) h^(n)_{t-1})
h′^(n)_t = f(U′^(n) o^(n-1)_t + W′^(n) h′^(n)_{t-1})
O_t = g(V h^(n)_t + V′ h′^(n)_t)
where O_t represents the output value of the multi-layer forward and backward result, o^(i)_t denotes the combined output of layer i, and h′^(i)_t represents the hidden layer state at time t of the i-th layer in the reverse direction.
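Stacking the bidirectional pass over several layers, where the merged output of one layer becomes the input of the next, can be sketched the same way (toy linear cells with weight 0.5 on the recurrent term, standing in for full GRU cells):

```python
# Multi-layer bidirectional recurrence: each layer runs a forward and a
# backward pass over its input sequence and feeds the summed outputs to
# the next layer. The cell h = x + 0.5*h is a toy stand-in for a GRU cell.
def stacked_bidirectional(xs, layers):
    for _ in range(layers):
        fwd, h = [], 0.0
        for x in xs:                 # forward pass of this layer
            h = x + 0.5 * h
            fwd.append(h)
        bwd, h = [], 0.0
        for x in reversed(xs):       # backward pass of this layer
            h = x + 0.5 * h
            bwd.append(h)
        bwd.reverse()
        xs = [f + b for f, b in zip(fwd, bwd)]  # merged output of this layer
    return xs
```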
The invention has the advantages that compared with the prior art,
compared with most of traditional natural language processing models, the BERT model can acquire more language intrinsic information more accurately, and the accuracy of named entity recognition is improved.
The deep learning model based on the MBIGRU strengthens the dependency on context information through its multilayer bidirectional structure, deeply describes the internal characteristics of the language, and can better extract the characteristics of power grid dispatching instructions.
The named entity recognition method based on BERT-MBIGRU-CRF starts from a power grid dispatching instruction data set, trains a neural network model that satisfies power grid dispatching specifications, and greatly improves the accuracy of dispatching instruction recognition when predicting sentences.
The present applicant has described and illustrated embodiments of the present invention in detail with reference to the accompanying drawings, but it should be understood by those skilled in the art that the above embodiments are merely preferred embodiments of the present invention, and the detailed description is only for the purpose of helping the reader to better understand the spirit of the present invention, and not for limiting the scope of the present invention, and on the contrary, any improvement or modification made based on the spirit of the present invention should fall within the scope of the present invention.

Claims (10)

1. A method for identifying a named entity of a power grid dispatching instruction based on a BERT-MBIGRU-CRF model is characterized by comprising the following steps:
(1) preprocessing the power grid dispatching instruction before training to obtain a labeled power grid dispatching instruction named entity recognition training set;
(2) building a BERT-MBIGRU-CRF power grid dispatching instruction identification model and completing training;
(3) and identifying the current power grid dispatching instruction by using the well-trained BERT-MBIGRU-CRF power grid dispatching instruction identification model, and outputting an identification result.
2. The method for identifying the named entity of the power grid dispatching instruction based on the BERT-MBIGRU-CRF model as claimed in claim 1, wherein the step (1) specifically comprises:
(1.1) eliminating punctuation marks and special characters in the power grid dispatching instruction samples, correcting wrongly written characters, and normalizing the text;
and (1.2) marking each single-sentence power grid dispatching instruction by adopting a named entity marking method.
3. The method for identifying the named entity of the power grid dispatching instruction based on the BERT-MBIGRU-CRF model as claimed in claim 1, wherein the step (2) specifically comprises:
(2.1) pre-training by using a BERT model to obtain a power grid dispatching instruction vector;
(2.2) extracting features by using the MBIGRU model to generate feature vectors;
and (2.3) marking constraint on the extracted feature vectors by using a CRF model, calculating a loss function, and stopping iteration when a loss value reaches a threshold value to finish training.
4. The BERT-MBIGRU-CRF model-based power grid dispatching instruction named entity identification method of claim 3, wherein in the step (2.1),
the BERT model comprises an embedding layer, a bidirectional Transformer encoder and a pooling layer; and respectively capturing word-level and sentence-level expressions through a masking language model and a next sentence prediction task, and performing combined training.
5. The method for identifying the named entity of the power grid dispatching instruction based on the BERT-MBIGRU-CRF model as claimed in claim 4, wherein the step (2.1) specifically comprises the following steps:
(2.1.1) converting the preprocessed power grid dispatching instruction named entity recognition training set into word vectors, text vectors and position vectors which are used as BERT model input, and entering an embedding layer;
(2.1.2) the embedding layer converts the power grid dispatching instruction into distributed representation vectors; in the embedding process, a [CLS] token marks the beginning of the instruction and an [SEP] token marks its end;
(2.1.3) performing a nonlinear representation of each power grid dispatching instruction with the bidirectional Transformer encoder, using position encoding to memorize the word vector sequence and connect context information bidirectionally;
and (2.1.4) outputting a power grid dispatching instruction vector and an integral sequence representation by the BERT model through a pooling layer.
6. The method according to claim 3, wherein in step (2.2), the MBIGRU model comprises two gating structures, namely an update gate and a reset gate, and the reset gate and the update gate together determine the output of the hidden state.
7. The BERT-MBIGRU-CRF model-based power grid dispatching instruction named entity identification method of claim 6, wherein,
the update gate in the MBIGRU controls how much information from the previous time step is carried into the current time step; the update gate state value z_t is given by:
z_t = f(W_z x_t + U_z h_{t-1})
where f denotes the sigmoid function, x_t represents the input vector at time t, h_{t-1} represents the hidden layer state at time t-1, W_z represents the update gate weight matrix, and U_z represents the update gate bias matrix;
the reset gate in the MBIGRU controls whether the information of the previous time step is discarded or enters the candidate state h̃_t; the reset gate state value r_t is given by:
r_t = f(W_r x_t + U_r h_{t-1})
where W_r represents the reset gate weight matrix and U_r represents the reset gate bias matrix;
the output of the MBIGRU cell is formed from the hidden state of the previous time step and the candidate state of the current time step, in which the previous hidden state is multiplied by r_t; the candidate state h̃_t and output state h_t are given by:
h̃_t = g(W x_t + U (r_t ⊙ h_{t-1}))
h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t
where g represents the activation function, ⊙ denotes element-wise multiplication, and the output state value depends on the update gate state and the reset gate state.
8. The method for identifying the named entity of the power grid dispatching instruction based on the BERT-MBIGRU-CRF model as claimed in claim 7, wherein the MBIGRU model is a multi-layer stack of forward and reverse GRU models on a single structure, and the model structure is represented as follows:
h_t = f(U x_t + W h_{t-1})
h′_t = f(U′ x_t + W′ h′_{t-1})
o_t = g(V h_t + V′ h′_t)
where f and g represent activation functions; U, W and V represent the weights in the forward operation; U′, W′ and V′ represent the weights in the reverse operation; o_t denotes the output value; and h′_t represents the hidden layer state at time t in the reverse direction.
9. The method for identifying the named entity of the power grid dispatching instruction based on the BERT-MBIGRU-CRF model as claimed in claim 8, wherein the MBIGRU model is superimposed many times on the basis of the forward and reverse GRU models, and the model structure is represented as follows:
h^(1)_t = f(U^(1) x_t + W^(1) h^(1)_{t-1})
h′^(1)_t = f(U′^(1) x_t + W′^(1) h′^(1)_{t-1})
……
h^(n)_t = f(U^(n) o^(n-1)_t + W^(n) h^(n)_{t-1})
h′^(n)_t = f(U′^(n) o^(n-1)_t + W′^(n) h′^(n)_{t-1})
O_t = g(V h^(n)_t + V′ h′^(n)_t)
where O_t represents the output value of the multi-layer forward and backward result, o^(i)_t denotes the combined output of layer i, and h′^(i)_t represents the hidden layer state at time t of the i-th layer in the reverse direction.
10. A power grid dispatching instruction named entity recognition system based on the BERT-MBIGRU-CRF model, characterized by comprising a sample preprocessing module, a BERT pre-training module, an MBIGRU feature extraction module, a CRF training module and a recognition result output module;
the sample preprocessing module is used for preprocessing the power grid dispatching instructions before training to obtain a labeled power grid dispatching instruction named entity recognition training set;
the BERT pre-training module is used for feeding the preprocessed power grid dispatching instruction named entity recognition training set to the BERT model as input, capturing word-level and sentence-level representations through the masked language model task and the next sentence prediction task respectively, performing joint training, and outputting power grid dispatching instruction vectors;
the MBIGRU feature extraction module is used for extracting features from the trained power grid dispatching instruction vectors with the MBIGRU model to generate feature vectors;
the CRF training module is used for labeling the extracted feature vectors under the constraints of the CRF model, computing the loss function, and stopping iteration to finish training when the loss value reaches the threshold;
and the recognition result output module is used for recognizing the current power grid dispatching instruction with the trained BERT-MBIGRU-CRF model and outputting the recognition result.
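The data flow through the five modules of claim 10 can be sketched as a pipeline of stubs. This is purely structural: the real modules are trained neural networks, whereas every function below (including the tag set LABELS and the random stand-ins for BERT, MBIGRU, and CRF) is a hypothetical placeholder that only shows how an instruction string becomes a per-character tag sequence.

```python
from typing import List
import numpy as np

LABELS = ["O", "B-DEV", "I-DEV"]       # hypothetical tag set

def preprocess(text: str) -> List[str]:
    return list(text.strip())          # character-level tokens

def bert_encode(tokens: List[str], d: int = 4) -> np.ndarray:
    rng = np.random.default_rng(len(tokens))
    return rng.standard_normal((len(tokens), d))   # stand-in for BERT vectors

def mbigru_features(vecs: np.ndarray) -> np.ndarray:
    return np.tanh(vecs)               # stand-in for MBIGRU feature extraction

def crf_decode(feats: np.ndarray) -> List[str]:
    # Stub emission-only decoding; a real CRF also applies transition constraints.
    proj = np.random.default_rng(0).standard_normal((feats.shape[1], len(LABELS)))
    return [LABELS[i] for i in (feats @ proj).argmax(axis=1)]

def recognize(text: str) -> List[str]:
    return crf_decode(mbigru_features(bert_encode(preprocess(text))))

tags = recognize("close breaker 5011")
print(len(tags), all(t in LABELS for t in tags))
```

The point of the sketch is the module boundary: each stage consumes exactly what the previous stage outputs, so any one module can be retrained or replaced without touching the others.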
CN202110864643.0A 2021-07-29 2021-07-29 Method and system for identifying named entities of power grid dispatching instructions based on BERT-MBIGRU-CRF model Pending CN113642862A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110864643.0A CN113642862A (en) 2021-07-29 2021-07-29 Method and system for identifying named entities of power grid dispatching instructions based on BERT-MBIGRU-CRF model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110864643.0A CN113642862A (en) 2021-07-29 2021-07-29 Method and system for identifying named entities of power grid dispatching instructions based on BERT-MBIGRU-CRF model

Publications (1)

Publication Number Publication Date
CN113642862A true CN113642862A (en) 2021-11-12

Family

ID=78418987

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110864643.0A Pending CN113642862A (en) 2021-07-29 2021-07-29 Method and system for identifying named entities of power grid dispatching instructions based on BERT-MBIGRU-CRF model

Country Status (1)

Country Link
CN (1) CN113642862A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115099606A (en) * 2022-06-21 2022-09-23 厦门亿力吉奥信息科技有限公司 Training method and terminal for power grid dispatching model
CN117012185A (en) * 2023-06-20 2023-11-07 国网山东省电力公司泗水县供电公司 Power grid dispatching method and system based on knowledge graph
CN115099606B (en) * 2022-06-21 2024-06-07 厦门亿力吉奥信息科技有限公司 Training method and terminal of power grid dispatching model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083831A (en) * 2019-04-16 2019-08-02 武汉大学 A Chinese named entity recognition method based on BERT-BiGRU-CRF
CN112115238A (en) * 2020-10-29 2020-12-22 电子科技大学 Question-answering method and system based on BERT and knowledge base
CN112733541A (en) * 2021-01-06 2021-04-30 重庆邮电大学 Named entity identification method of BERT-BiGRU-IDCNN-CRF based on attention mechanism


Similar Documents

Publication Publication Date Title
CN110609891B (en) Visual dialog generation method based on context awareness graph neural network
CN110083831B (en) Chinese named entity identification method based on BERT-BiGRU-CRF
CN107729309B (en) Deep learning-based Chinese semantic analysis method and device
CN110232114A (en) Sentence intension recognizing method, device and computer readable storage medium
CN111897908A (en) Event extraction method and system fusing dependency information and pre-training language model
CN110647612A (en) Visual conversation generation method based on double-visual attention network
CN111339750B (en) Spoken language text processing method for removing stop words and predicting sentence boundaries
CN107797987B (en) Bi-LSTM-CNN-based mixed corpus named entity identification method
CN109918681B (en) Chinese character-pinyin-based fusion problem semantic matching method
CN112183064B (en) Text emotion reason recognition system based on multi-task joint learning
CN110909736A (en) Image description method based on long-short term memory model and target detection algorithm
CN112101044B (en) Intention identification method and device and electronic equipment
CN115292463B (en) Information extraction-based method for joint multi-intention detection and overlapping slot filling
CN111274804A (en) Case information extraction method based on named entity recognition
CN114239574A (en) Miner violation knowledge extraction method based on entity and relationship joint learning
CN111597342B (en) Multitasking intention classification method, device, equipment and storage medium
CN112182191A (en) Structured memory map network model for multi-turn spoken language understanding
CN114443813B (en) Intelligent on-line teaching resource knowledge point concept entity linking method
CN112037773A (en) N-optimal spoken language semantic recognition method and device and electronic equipment
CN113223509A (en) Fuzzy statement identification method and system applied to multi-person mixed scene
CN107797988A (en) A mixed-corpus named entity recognition method based on Bi-LSTM
CN111597816A (en) Self-attention named entity recognition method, device, equipment and storage medium
CN113642862A (en) Method and system for identifying named entities of power grid dispatching instructions based on BERT-MBIGRU-CRF model
CN116484848B (en) Text entity identification method based on NLP
CN112257432A (en) Self-adaptive intention identification method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211112