CN113312500B

CN113312500B - Method for constructing event map for safe operation of dam

Info

Publication number: CN113312500B
Application number: CN202110702542.3A
Authority: CN
Inventors: 毛莺池; 季佳丽; 肖海斌; 程永; 苏茂; 吴威; 王龙宝; 陈豪; 简树明; 丁玉江; 谭彬; 张润; 刘锦; 岳宏斌; 赵盛杰; 熊成龙; 沈凤群; 冉龙明; 娄毅博; 李旭
Original assignee: Hohai University HHU; Huaneng Group Technology Innovation Center Co Ltd; Huaneng Lancang River Hydropower Co Ltd
Current assignee: Hohai University HHU; Huaneng Group Technology Innovation Center Co Ltd; Huaneng Lancang River Hydropower Co Ltd
Priority date: 2021-06-24
Filing date: 2021-06-24
Publication date: 2022-05-03
Anticipated expiration: 2041-06-24
Also published as: CN113312500A

Abstract

The invention discloses a construction method of an event map for safe operation of a dam, which comprises the following steps: (1) embedding enhanced semantic information using word vectors; (2) local attention is introduced, global attention captures keywords and document information is mined; (3) solving the sample imbalance problem and extracting event types by using Focalloss; (4) adding event type codes to form a new sentence code in a serial connection manner, and processing an embedded vector after the serial connection by using BilSTM; (5) converting an attention network and an attention network according to a proportional fusion graph, extracting effective characteristics, and capturing corresponding event arguments; (6) filling the network with arguments to fill in the document-level arguments missing in the event; (7) and converting the attention network and the attention network labeling sequence by the weighted fusion graph, acquiring the causal relationship between the events and constructing an event graph.

Description

Method for constructing event map for safe operation of dam

Technical Field

The invention relates to an event map construction method for dam safe operation, which extracts dam operation condition events and event arguments thereof in text data through an event detection and extraction model based on double attention to form an event knowledge map. The method aims to automatically extract events and arguments thereof from a large number of dam operation record texts and generate a dam event knowledge graph.

Background

The concept of knowledge graph was proposed by Google in 2012, and was first used by search engines for entity-based searches instead of string-based searches, thereby improving the user search quality and experience. In the big data age, the knowledge graph expresses the information of the internet in a structured form to a form closer to the human cognitive world, and provides the capability of better organizing, managing and understanding the mass information of the internet.

In the big data era, manual labor cannot meet the construction requirement of the knowledge graph. Many enterprises begin to actively explore and try automatic construction technologies, and extract data from different sources and structures by using machines to form knowledge which is stored in a knowledge graph. In industrial practice, knowledge graph construction through extracting knowledge from unstructured data such as text information and the like faces many technical challenges.

Event relation extraction is compared with entity relation extraction, the relation between two events needs to be judged, the description of the events in the text is usually complex and may be a sentence or a plurality of sentences, and therefore the difficulty of establishing a knowledge graph with the events as the center is larger than that of establishing the entity knowledge graph.

Disclosure of Invention

The purpose of the invention is as follows: aiming at the problems in the prior art, the invention provides an event map construction method facing dam safe operation, which is used for constructing a map with dam safe operation events as a center, and the map with the dam safe operation events as the center can better reflect the cause and time sequence relation between the events, thereby improving the capability of dam managers for dealing with future emergency events.

The technical scheme is as follows: an event map construction method for safe operation of a dam comprises the following steps:

(1) and converting sentences and documents containing the information of the dam safe operation events into feature vectors by using an ALBERT embedding layer, enhancing the semantic information of Chinese, and processing the feature vectors converted by each word by using BilSTM.

(2) Introducing a local attention simulation event trigger, distributing corresponding weight of each word according to the importance degree, taking the word with the highest weight value as a hidden event trigger, introducing keywords and document context information in the global attention learning sentence, obtaining the unique meaning of the trigger in the scene, and assisting in judging the event type of the sentence.

(3) Training a trigger-free event detection model, adopting Focal loss as a loss function in the training process of the trigger-free event detection model, solving the problem of sample imbalance, simultaneously strengthening the influence of a positive sample and a difficultly-divided sample on the model, and outputting a predicted event type.

(4) And connecting event type coding vectors in series behind the feature vectors to form new sentence codes, and capturing context information by processing the embedded vectors after connection through the BilSTM.

(5) According to the sentence structure generated by dependency syntax analysis and the semantic vector generated by BilSTM, the attention mode is introduced to convert the network layer and the characteristics extracted by the attention network layer according to the weight fusion graph, a new expression vector is generated, and the event argument is extracted in a BIO sequence labeling mode.

(6) And judging key events in the key sentences by adopting the TextCNN, and supplementing missing event roles by using filling words in adjacent sentences to realize missing argument extraction of the events and supplement an event knowledge graph.

(7) And converting the attention network and the attention network by the weighted fusion graph to label sequences, acquiring causal relationships among events and constructing an event graph.

Further, in the step (1), sentences and documents containing the information of the safe operation events of the dam are converted into feature vectors by using an ALBERT embedding layer, semantic information of Chinese is enhanced, and the specific steps of processing the feature vectors converted by each word by using BilSTM are as follows:

(1.1) taking the operation records of equipment under the daily working condition and the emergency working condition of the dam as a fine tuning training corpus, then performing fine tuning training on a pre-trained ALBERT model, converting sentences into feature vectors W in a mathematical form, dynamically learning context information of a document, training different feature vectors of the same-name vocabulary according to the document information, and avoiding the problem of word ambiguity;

(1.2) processing the sentence feature vector W using the BilSTM network, outputting two hidden states

And

synthesis of

Sentence context information is denoted by h.

Further, the step (2) introduces a local attention simulation event trigger, assigns a corresponding weight value to each word according to the importance degree, takes the word with the highest weight value as a hidden event trigger, introduces keywords and document context information in the global attention learning sentence, obtains the unique meaning of the trigger in the scene, and assists in judging the event type of the sentence as follows:

(2.1) introduce a local attention mechanism to output LSTM output vector h, event type feature toQuantity t₁As input, use the formula

Obtaining a local attention vector alpha_sWherein h is_kIs the kth part of the output vector h,

is the local attention vector alpha_sThe kth part of (1);

(2.2) introducing a global attention mechanism, and embedding the output vector h and the event type into the vector t₂And a document level embedded vector d is used as input, and a formula is used

Computing a global attention embedding vector α_dWherein h is_kIs the kth part of the output vector h,

is the global attention vector alpha_dThe kth part of (1);

(2.3) for α_sAnd t₁Using dot product operations, generating v_sEvent trigger to capture local features and simulate concealment, for alpha_dAnd t₂Using dot product operations, generating v_dGlobal feature and context information is captured, followed by the Sigmoid function o- σ (λ · v)_s+(1-λ)·v_d) Processing the weighted-averaged dual attention vector v_sAnd v_d，λ∈[0,1]Is at v_sAnd v_dA hyper-parameter to make a trade-off between.

Further, in the step (3), Focal loss is adopted as a loss function in the model training process, the influence of the positive sample and the difficult sample on the model is strengthened while the problem of sample imbalance is solved, and the specific steps of outputting the predicted event type are as follows:

(3.1) during the model training process, the Focal loss is used as a loss function J (theta), and the formula is as follows:

where x is composed of sentences and target event types, y ∈ {0,1}, o (x)⁽ⁱ⁾) Is the predicted value of the model, | theta | | Y luminance²Is the sum of squares of each element in the model, δ > 0 is the weight of the L2 normalization term, β is the parameter of the proportion of positive and negative weights of the balanced samples, and γ is the parameter of the proportion of hard-to-classify and easy-to-classify weights of the balanced samples. Adding L2 regularization to the loss function prevents overfitting of the model, after which the model is trained;

and (3.2) testing on a dam operation condition data set by using the trained model, and outputting most possible event type numbers contained in each sentence according to preset possible event types.

Further, the step (4) concatenates the event type encoding vectors after the feature vectors to form a new sentence code, and captures the context information by processing the concatenated embedded vectors through BilSTM as follows:

and (3.2) adding the obtained sentence prediction event type vector into the sentence coding, and connecting the sentence prediction event type vector with word embedding, entity type embedding and part of speech tagging in series to form a 312-dimensional vector as an embedded vector which is used as the input of a BilSTM model to obtain a hidden vector sequence as a text semantic structure.

Further, the step (5) introduces the attention mode according to the characteristics extracted by the weight fusion graph conversion network layer and the attention network layer according to the sentence structure generated by the dependency syntax analysis and the semantic vector generated by the BilSTM, generates a new expression vector, and extracts the event argument by the BIO sequence labeling mode, which comprises the following specific steps:

(5.1) use of a binary matrix A of N^dAs a syntactic structure, when a word w_iAnd w_jWith links in the dependency tree, then A^d(i, j) is set to 1, otherwise 0;

(5.2) if w_iAnd w_jThere is a dependency edge between and the dependency label is r, and a is initialized by using the embedded vector of r embedded in the lookup table^dl(i, j), otherwise using p-dimensional all zerosVector initialization A^dl(i, j), then using the formula

Will depend on the label matrix A^dlConversion to dependency tag score matrix

Wherein U is a trainable weight matrix;

(5.3) calculating a hidden vector h_iAnd h_jScore between them to obtain semantic score matrix A^sThe calculation formula is as follows:

k_i＝U_kh_i,q_i＝U_qh_i,

wherein U is_kAnd U_qIs a trainable weight matrix;

(5.4) Adjacent dependency Tree to matrix A^dDependent tag matrix A^dlAnd semantic score matrix A^sObtaining a dependency graph matrix by cascading

(5.5) it proposes graph transformation attention networks GTANs, which adopt 1 x 1 convolution to the adjacent matrix A set and soft select two intermediate adjacent matrixes Q₁And Q₂Generating a new meta-path graph A by matrix multiplication^lCompare path graph A^lEach channel employs a GAT network and represents a plurality of nodes in series as Z, the formula is as follows:

where, | | is the join operator, C represents the number of channels,

is A^lThe contiguous matrix of the ith channel of (a),

is that

V is a trainable weight matrix shared across channels, and X is a feature matrix;

(5.6) introducing an attention mechanism, and calculating an attention network layer weight matrix

And then, using sigmoid function to activate weighted fusion graph to convert characteristics of attention network layer and attention network layer

(5.7) using sequence labeling to label the event, labeling the beginning part of a key argument B, labeling the middle part of the key argument I, labeling other words except the key argument in a sentence O, and then using a conditional random field CRF to process a feature fusion vector

And outputs a predicted argument tag for each character in the specified event.

Further, the step (6) adopts the TextCNN to judge the key events in the key sentences, and then uses the filler words in the adjacent sentences to supplement the missing event roles to realize the missing argument extraction, and the specific steps of supplementing the event knowledge graph are as follows:

(6.1) filling arguments missed in the event, connecting four embedded vectors of argument labels, entity types, sentences and documents in series to form 880-dimensional new vectors, setting 128 vectors input by processing with convolution kernel sizes of 3, 4 and 5, projecting to 128 dimensions, and then judging whether the sentences contain key events through pooling and full connection layers;

and (6.2) for sentences containing key events, calculating the similarity between the rest sentences in the same document and the key sentences by using a MalSTM model, sequencing, and searching for argument roles in adjacent sentences which have the key events corresponding to the missing arguments and the highest similarity and filling the argument roles.

Further, the step (7) of converting the attention network and the attention network labeling sequence by the weighted fusion graph to obtain the causal relationship between the events comprises the following specific steps of:

constructing a frame comprising an embedding layer, a bidirectional long-short memory layer, a feature extraction layer, a fusion gate layer and a conditional random field layer, and carrying out BIO and CE labeling on the sequence, wherein B represents the beginning of an event argument, I represents the inside of the event argument, C is a reason, E is a result, and O is other words. B-C and I-C sequences are causal events, B-E and I-E sequences are effect events, so that the causal relationship among the events is constructed, and the knowledge graph is constructed.

Has the advantages that: compared with the prior art, the dam-oriented safe operation event map construction method based on the double attention mechanism avoids the situations that an event trigger has multiple meanings and words are not matched with the trigger. By means of local attention, semantic information is fully mined, an event trigger word is extracted by replacing a trigger with the importance degree, and the problem that the word is not matched with the trigger is solved; by means of global attention, the intermediate keywords and the document context information are learned, the unique meaning of the trigger in the scene is obtained, and the trigger ambiguity problem is solved. And finally, a Focal loss function training model is used for learning the safe operation record text data of the dam, and an event knowledge graph taking the event as the center is automatically constructed, so that the labor cost is saved, meanwhile, a knowledge base is established for the dam, the knowledge is stored in a structured form, and a foundation is provided for establishing event-driven application of the dam in the future.

Drawings

FIG. 1 is a flow diagram of a method of an embodiment;

FIG. 2 is a partial result graph of an event graph for a specific embodiment.

Detailed Description

The present invention is further illustrated by the following examples, which are intended to be purely exemplary and are not intended to limit the scope of the invention, which is to be given the full breadth of the claims appended hereto.

As shown in fig. 1, a method for constructing an event graph for safe operation of a dam includes the following steps:

(1.1) taking the operation records of equipment under the daily working condition and the emergency working condition of the dam as a fine tuning training corpus, then performing fine tuning training on the pre-trained ALBERT model, converting sentences into feature vectors W in a mathematical form, and dynamically learning document context information;

And

synthesizing into

Sentence context information is denoted by h.

(2.1) introducing a local attention mechanism, and outputting an LSTM output vector h and an event type feature vector t₁As input, use the formula

To gain local attentionQuantity alpha_sWherein h is_kIs the kth part of the output vector h,

is the local attention vector alpha_sThe kth part of (1);

is the global attention vector alpha_dThe kth part of (1);

(3) And in the model training process, the Focal loss is used as a loss function, the influence of the positive sample and the difficultly-divided sample on the model is enhanced while the problem of sample imbalance is solved, and the type of a predicted event is output.

where x is composed of sentences and target event types, y ∈ {0,1}, o (x)⁽ⁱ⁾) Is a model forecastMeasuring value, | θ | | non-conducting phosphor²Is the sum of squares of each element in the model, δ > 0 is the weight of the L2 normalization term, β is the parameter of the proportion of positive and negative weights of the balanced samples, and γ is the parameter of the proportion of hard-to-classify and easy-to-classify weights of the balanced samples. Adding L2 regularization to the loss function prevents overfitting of the model, after which the model is trained;

and (3.2) testing on the dam operation condition data set by using the trained model, and outputting the most likely event type number contained in each sentence according to the possible event types set in advance.

(5.2) if w_iAnd w_jWith a dependency edge and a dependency label of r, initializing A by using an embedded vector of r embedded in a lookup table^dl(i, j), otherwise initializing A using a p-dimensional all-zero vector^dl(i, j), then using the formula

Will depend on the label matrix A^dlConversion to dependency tag score matrix

Wherein U is a trainable weight matrix;

k_i＝U_kh_i,q_i＝U_qh_i,

wherein U is_kAnd U_qIs a trainable weight matrix;

(5.5) it proposes graph transformation attention networks GTANs, which adopt 1 x 1 convolution to the adjacent matrix A set and soft select two intermediate adjacent matrixes Q₁And Q₂Generating a new meta-path graph A by matrix multiplication^lCompare path graph A^lEach channel employs a GAT network and represents the nodes in series as Z, with the following equation:

where, | | is the join operator, C represents the number of channels,

is A^lThe contiguous matrix of the ith channel of (a),

is that

And outputs a predicted argument tag for each character in the specified event.

(7) Constructing a frame comprising an embedding layer, a bidirectional long-short memory layer, a feature extraction layer, a fusion gate layer and a conditional random field layer, and carrying out BIO and CE labeling on the sequence, wherein B represents the beginning of an event argument, I represents the inside of the event argument, C is a reason, E is a result, and O is other words. B-C and I-C sequences are causal events, B-E and I-E sequences are effect events, so that the causal relationship among the events is constructed, and finally the knowledge graph is constructed.

Claims

1. An event map construction method for safe operation of a dam is characterized by comprising the following steps:

(1) converting sentences and documents containing dam safe operation event information into feature vectors by using an ALBERT embedding layer, enhancing Chinese semantic information, and processing the feature vectors converted from each word by using BilSTM;

(2) introducing a local attention simulation event trigger, distributing corresponding weight of each word according to importance degree, taking the word with the highest weight value as a hidden event trigger, introducing keywords and document context information in a global attention learning sentence, obtaining the unique meaning of the trigger in the current scene, and assisting in judging the event type of the sentence;

(3) during the model training process, Focal loss is used as a loss function, the influence of a positive sample and a difficultly-divided sample on the model is enhanced while the problem of sample imbalance is solved, and a prediction event type is output;

(4) connecting event type coding vectors in series behind the feature vectors to form new sentence codes, processing the embedded vectors after connection in series through BilSTM, and capturing context information;

(5) according to a sentence structure generated by dependency syntactic analysis and a semantic vector generated by BilSTM, introducing the characteristics extracted by an attention mode conversion network layer and an attention network layer according to a weight fusion graph to generate a new expression vector, and extracting event arguments in a BIO sequence labeling mode;

(6) judging key events in the key sentences by using the TextCNN, and supplementing missing event roles by using filling words in adjacent sentences to realize missing argument extraction of the events and supplement an event knowledge graph;

2. The dam safety operation oriented event map construction method according to claim 1, wherein in the step (1), sentences and documents containing dam safety operation event information are converted into feature vectors by using an ALBERT embedding layer, Chinese semantic information is enhanced, and the specific steps of processing the feature vectors converted by each word by using BilSTM are as follows:

(1.1) taking the operation records of equipment under the daily working condition and the emergency working condition of the dam as a fine tuning training corpus, then performing fine tuning training on the pre-trained ALBERT model, and converting sentences into feature vectors W in a mathematical form;

And

synthesis of

The sentence context information is represented by the LSTM output vector h.

3. The dam safety operation-oriented event map construction method according to claim 2, wherein a local attention simulation event trigger is introduced in the step (2), a corresponding weight value of each word is distributed according to the importance degree, the word with the highest weight value is taken as a hidden event trigger, keywords and document context information in a global attention learning sentence are introduced, the unique meaning of the trigger in the scene is obtained, and the specific steps of assisting in judging the event type of the sentence are as follows:

is the local attention vector alpha_sThe kth part of (1);

is the global attention vector alpha_dThe kth part of (1);

4. The dam safety operation oriented event graph construction method according to claim 1, wherein in the step (3), Focal local is adopted as a loss function in a model training process, influence of a positive sample and a hard sample on the model is strengthened while the problem of sample unbalance is solved, and specific steps of outputting a predicted event type are as follows:

where x is composed of sentences and target event types, y ∈ {0,1}, o (x)⁽ⁱ⁾) Is the predicted value of the model, | theta | | Y luminance²The sum of squares of all elements in the model, delta & gt 0 is the weight of an L2 normalization term, beta is a parameter of the proportion of positive and negative weights of a balance sample, gamma is a parameter of the proportion of hard-to-classify and easy-to-classify weights of the balance sample, L2 regularization is added in a loss function to prevent the model from being over-fitted, and then the model is trained;

5. The method for constructing an event graph for safe operation of a dam according to claim 4, wherein in the step (4), the event type coding vector is connected in series after the feature vector to form a new sentence code, and the embedded vector after connection is processed by BilSTM, and the specific steps of capturing context information are as follows:

and (3) adding the sentence prediction event type obtained in the step (3) into the sentence coding, and connecting the sentence prediction event type with word embedding, entity type embedding and part of speech tagging in series to form a 312-dimensional vector as an embedded vector which is used as the input of a BilSTM model to obtain a hidden vector sequence as a text semantic structure.

6. The dam safety operation oriented event map construction method according to claim 1, wherein the step (5) introduces the attention mode according to the sentence structure generated by the dependency syntax analysis and the semantic vector generated by BilSTM, integrates the features extracted by the graph conversion network layer and the attention network layer according to the weight, generates a new expression vector, and extracts the event argument through the BIO sequence labeling mode specifically comprises the following steps:

(5.1) use of a dependency tree adjacency matrix A of N^dAs a syntax structure, whenWord w_iAnd w_jWith links in the dependency tree, then A^d(i, j) is set to 1, otherwise 0;

(5.2) if w_iAnd w_jThere is a dependency edge between and the dependency label is r, and a is initialized by using the embedded vector of r embedded in the lookup table^dl(i, j), otherwise initializing A using a p-dimensional all-zero vector^dl(i, j), then using the formula

Will depend on the label matrix A^dlConversion to dependency tag score matrix

Wherein U is a trainable weight matrix;

k_i＝U_kh_i,q_i＝U_qh_i,

wherein U is_kAnd U_qIs a trainable weight matrix;

(5.5) providing graph transformation attention networks GTANs, adopting 1 x 1 convolution to the dependency graph matrix A set, and selecting two intermediate adjacent matrixes Q₁And Q₂Generating a new meta-path graph A by matrix multiplication^lCompare path graph A^lEach channel employs a GAT network and represents the nodes in series as Z, with the following equation:

where | is the join operator, C denotes the number of channels,

is A^lThe contiguous matrix of the ith channel of (a),

is that

And outputs a predicted argument tag for each character in the specified event.

7. The dam safety operation-oriented event map construction method according to claim 1, wherein in the step (6), TextCNN is adopted to judge key events in key sentences, and then filler words in adjacent sentences are used to supplement missing event roles to realize missing argument extraction, and the specific steps of supplementing an event knowledge map are as follows:

(6.1) filling the argument missed by the event, connecting four embedded vectors of argument labels, entity types, sentences and documents in series to form 880-dimensional new vectors, setting 128 vectors with convolution kernel sizes of 3, 4 and 5 for processing input, projecting to 128 dimensions, and then judging whether the sentences contain key events through a pooling and full-connection layer;

and (6.2) for sentences containing key events, calculating the similarity between the rest sentences in the same document and the key sentences by using a MaLSTM model, sequencing, searching for argument roles in adjacent sentences of which the key events correspond to the missing arguments and have the highest similarity, and filling.

8. The dam safety operation oriented event map construction method according to claim 1, wherein the step (7) comprises the steps of:

and (3) labeling the events by using sequence labeling, extracting reason and result arguments, and defining the events to which the reason and result arguments belong as reason events and result events, thereby establishing a causal relationship between the events and constructing an event map.