CN112632978A - End-to-end-based substation multi-event relation extraction method


Info

Publication number
CN112632978A
CN112632978A (application CN202011544274.9A)
Authority
CN
China
Prior art keywords
event
data
relationship
multivariate
neural network
Prior art date
Legal status
Pending
Application number
CN202011544274.9A
Other languages
Chinese (zh)
Inventor
朱仲贤
蔡科伟
刘文涛
杜瑶
李世民
臧春华
刘鑫
徐蒙福
孔小飞
Current Assignee
Maintenance Co of State Grid Anhui Electric Power Co Ltd
Overhaul Branch of State Grid Anhui Electric Power Co Ltd
Institute of Advanced Technology University of Science and Technology of China
Original Assignee
Overhaul Branch of State Grid Anhui Electric Power Co Ltd
Institute of Advanced Technology University of Science and Technology of China
Priority date
Filing date
Publication date
Application filed by Overhaul Branch of State Grid Anhui Electric Power Co Ltd and Institute of Advanced Technology University of Science and Technology of China
Priority to CN202011544274.9A

Classifications

    • G06F40/279: Recognition of textual entities
    • G06F40/284: Lexical analysis, e.g. tokenisation or collocates
    • G06N3/045: Combinations of networks
    • G06N3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08: Learning methods


Abstract

The invention relates to the field of artificial intelligence and provides an end-to-end method for extracting multivariate event relations from transformer-substation text. The method preprocesses raw text data into corpus data with an event-description structure; divides the corpus data into complete event chunks; and constructs a pre-training language model, training the existing pre-trained model to obtain a trained language model. Together with a labeling scheme for multivariate event relations in the substation corpus, the method unifies event extraction with the identification of causal, conditional, and parallel relations between events, provides an efficient route to structured extraction from large volumes of specialized substation fault-handling text, and fully mines the multivariate relations among events.

Description

End-to-end-based substation multi-event relation extraction method
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to an end-to-end method for extracting multivariate event relations in transformer substations.
Background
Event relation extraction identifies the relations between events in text and stores them in a knowledge base; it is one of the key technologies of artificial-intelligence text processing. By automatically recognizing the semantic relation between entities, it can be divided into binary and multivariate relation extraction according to the number of participating entities. Binary extraction focuses on the semantic relation between two entities to obtain (arg1, relation, arg2) triples, where arg1 and arg2 denote the two entities and relation denotes the semantic relation between them. Event relation extraction turns the data in event-description text into structured form for further processing, and currently has broad application prospects in artificial intelligence, knowledge graphs, and related fields.
CN110377756B discloses a method for extracting event relations from a massive data set, comprising the following steps: S1: establishing association relations and association strengths among triples according to association rules to form an undirected network; S2: concatenating the antecedent word vector, the consequent word vector, and the entity type of each triple as the features of the nodes in the undirected network, specifically: S21: extracting the antecedents and consequents of the triples and combining them into antecedent and consequent word vectors; S22: extracting the entity types in the triples; S23: encoding the antecedent word vector, the consequent word vector, and the entity type in one-hot form as the node features; S3: classifying each node in the undirected network and extracting the entity relations in events, where classifying the nodes comprises: S31: each node transforms its feature information and sends it to its neighbor nodes; S32: each node aggregates the feature information of its neighbors; S33: applying a nonlinear transformation after aggregation; S34: the sample data is classified and trained in the same way as a convolutional neural network.
At present, events represented by predicates in substation corpora lack too much information, and conventional methods for extracting causal relations between events in the corpus require extensive feature engineering. This leads to low efficiency and to low accuracy caused by error propagation, and has become an urgent technical difficulty for applying traditional event relation extraction methods in the substation domain.
Disclosure of Invention
A large number of tests and practical applications show that the technical field of transformer substations and its knowledge ontology span many intersecting domains, and events represented by predicates in the corpus omit too much information to be usable on their own. Traditional techniques extract events, temporal relations, causal relations, and the like in isolation, which causes error propagation. The Chinese disease-diagnosis treebank annotates complete event representations and event relation connectives, but such traditional methods, centered on causal relations, require extensive feature engineering and are inefficient.
In view of the above, the present invention is directed to a method for extracting a multivariate event relationship of a substation from end to end, the method for extracting a multivariate event relationship of a substation from end to end comprising,
step S1, preprocessing the original text data to form the original corpus data of the event description structure;
step S2, dividing original corpus data into complete event chunks to obtain N fault event groups, wherein N is a positive integer greater than 1, performing data annotation on the N fault event group data, and annotating the N fault event group data in a boundary position-element form to obtain annotated first data;
step S3, constructing a pre-training language model, and training the existing pre-training language model to obtain a trained language model; then, using the static word vectors of the single characters and the multiple characters in the first data as the coding features of input data of a pre-training language model to obtain pre-trained second data;
and step S4, constructing a graph neural network-based event extraction model, taking the second data as input data of the graph neural network-based event extraction model, and extracting and obtaining relationship data between fault event groups.
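As a minimal sketch of how steps S1 to S4 fit together, the pipeline below wires up four stand-in stages; every function body here is an illustrative placeholder (hash-based "encoding", trivial chunking), not the patent's actual models:

```python
def preprocess(raw_text):
    """S1: normalize raw fault-report text into event-description sentences."""
    return [s.strip() for s in raw_text.replace("；", "。").split("。") if s.strip()]

def chunk_events(sentences):
    """S2: group sentences into complete event chunks (fault event groups).
    A trivial one-sentence-per-chunk split, for illustration only."""
    return [[s] for s in sentences]

def encode(chunks):
    """S3: stand-in for pre-trained language-model encoding (static word vectors)."""
    return [[hash(tok) % 100 for tok in chunk] for chunk in chunks]

def extract_relations(encoded):
    """S4: stand-in for the graph-neural-network relation extractor, emitting
    one unlabeled relation between each pair of adjacent chunks."""
    return [("chunk%d" % i, "chunk%d" % (i + 1), "unlabeled")
            for i in range(len(encoded) - 1)]

sentences = preprocess("铁芯局部过热。油温增高。")
chunks = chunk_events(sentences)
relations = extract_relations(encode(chunks))
print(len(sentences), len(relations))
```

The real method replaces `encode` with the BERT + n-gram features of step S3 and `extract_relations` with the graph neural network of step S4.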
Preferably, the graph-neural-network-based event extraction model comprises a BERT model, where character-level input is fed to the model and, according to the vocabulary words that each Chinese character of a sub-event may match, lexical features are integrated into the vector representation of the sub-event.
Preferably, in step S2, the element labeling form in the event includes element position information { B (element start), M (element inside), E (element end), S (single element) }, and all other parts in the event are marked as O.
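The boundary-position labeling above can be sketched as follows; the `(start, end)` span convention (end exclusive) is an assumption for illustration, not taken from the patent:

```python
def bmes_tags(sentence_len, elements):
    """Label element spans with B (element start), M (element inside),
    E (element end) or S (single element); every other position gets O.
    `elements` is a list of (start, end) spans, end exclusive."""
    tags = ["O"] * sentence_len
    for start, end in elements:
        if end - start == 1:
            tags[start] = "S"          # single-character element
        else:
            tags[start] = "B"
            for i in range(start + 1, end - 1):
                tags[i] = "M"
            tags[end - 1] = "E"
    return tags

# A 6-character sentence with a 3-character element and a 1-character element.
print(bmes_tags(6, [(0, 3), (4, 5)]))
```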
Preferably, in step S2, the label annotated on a meta-event includes the relationship between the current meta-event and the next adjacent event;
and subdividing the relations among the N fault event groups, marking the last meta-event of each event chunk in the same original corpus data with relation data, wherein the relation data comprises: the previous event chunk is the cause of the next event chunk; the event chunks are parallel; the previous event chunk is a condition of the next; and no relation.
Preferably, after the event-based annotation, the characteristics of the event are expressed as:
$$e = e_c \oplus e_l$$

where $e_l$ is the concatenation of the lexical features of all characters and $e_c$ is the character feature; after splicing they form the complete sub-event vector representation, and $V_l$ is:

$$V_l(S) = \frac{1}{Z} \sum_{w \in S} z(w)\, e^w(w), \qquad Z = \sum_{w \in L} z(w)$$

wherein $L$ is the vocabulary set and $z(w)$ represents the word frequency.
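Assuming $V_l$ is a frequency-weighted, frequency-normalized pooling of the matched-word embeddings (one plausible reading of the description above), it can be sketched as:

```python
import numpy as np

def weighted_wordset_vector(words, embed, freq):
    """Frequency-weighted sum of matched-word embeddings, normalized by the
    total frequency of the set -- an assumed form of the V_l formula."""
    if not words:
        return np.zeros(next(iter(embed.values())).shape)
    z = sum(freq[w] for w in words)
    return sum(freq[w] * embed[w] for w in words) / z

# Toy lexicon entries (hypothetical embeddings and frequencies).
embed = {"油温": np.array([1.0, 0.0]), "温度": np.array([0.0, 1.0])}
freq = {"油温": 3, "温度": 1}
v = weighted_wordset_vector(["油温", "温度"], embed, freq)
print(v.tolist())
```

The more frequent word dominates the pooled vector, which matches the word-frequency weighting described in the text.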
Preferably, a pre-trained language model is constructed in step S3, using single-character and multi-character static word vectors as encoding features of the input data.
Preferably, the graph-neural-network-based event extraction model further comprises Uni-gram, Bi-gram, BiLSTM, Attention, and CRF components; the vector encoded by BERT + Uni-gram + Bi-gram is input into the BiLSTM + Attention + CRF neural network model, sequence tag information is output from the neural network model, and a loss function is established based on the sequence tag information.
Preferably, the test set is used for evaluating the event extraction model based on the graph neural network, and if the evaluation result is lower than a preset target, the step of constructing the event relation model is repeated; and if the evaluation achievement reaches a preset target, terminating the step of constructing the event relation extraction model to obtain the graph-based neural network event extraction model.
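The evaluate-until-target loop described above can be sketched as follows; the callables and the fake score sequence are stand-ins for the real training and evaluation procedures:

```python
def train_until_target(train_step, evaluate, target_score, max_rounds=10):
    """Repeat model construction/training until the test-set evaluation
    reaches a preset target, as described in the text."""
    for round_no in range(1, max_rounds + 1):
        train_step()
        score = evaluate()
        if score >= target_score:
            return round_no, score      # target reached: stop constructing
    return max_rounds, score            # give up after max_rounds

scores = iter([0.62, 0.71, 0.83])       # fake evaluation results
rounds, score = train_until_target(lambda: None, lambda: next(scores), 0.80)
print(rounds, score)
```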
Preferably, an event chain diagram is obtained from the transformer-substation fault and defect corpus using the graph-based neural network event extraction model, and then the lower-level event elements of the meta-events in the event chain diagram are extracted using the same model.
According to another aspect of the embodiments of the present invention, there is provided a storage medium, the storage medium including a stored program, wherein when the program runs, a device in which the storage medium is located is controlled to execute the above method.
Compared with the prior art, the end-to-end substation multivariate event relation extraction method provided by the invention achieves the following technical effects. Raw text data is preprocessed into corpus data with an event-description structure. The corpus data is divided into complete event chunks to obtain N fault event groups, N being a positive integer greater than 1; the N fault event groups are annotated in boundary-position-element form to obtain annotated first data. A pre-training language model is constructed and the existing pre-trained model is trained to obtain a trained language model; the static word vectors of single and multiple characters in the first data are then used as the encoding features of the model's input data to obtain pre-trained second data. Together with the labeling scheme for multivariate event relations in the substation corpus, the method unifies event extraction with the identification of causal, conditional, and parallel relations, provides an efficient route to structured extraction from large volumes of specialized substation fault-handling text, and fully mines the multivariate relations among events.
Additional features and advantages of the invention will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow chart of an embodiment of a method for end-to-end-based substation multivariate event relationship extraction according to the present invention;
fig. 2 is an event sequence annotation diagram in the end-to-end-based substation multivariate event relationship extraction method of the present invention.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged under appropriate circumstances in order to facilitate the description of the embodiments of the invention herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention aims to solve the problems identified in the background art: predicates in substation corpora omit too much event information, methods for extracting causal event relations from the corpus require extensive feature engineering, efficiency is low, and accuracy suffers from error propagation. To this end, the invention provides an end-to-end method for extracting multivariate event relations in transformer substations which, as shown in FIG. 1, comprises the following steps:
step S1, preprocessing the original text data to form the original corpus data of the event description structure;
step S2, dividing original corpus data into complete event chunks to obtain N fault event groups, wherein N is a positive integer greater than 1, performing data annotation on the N fault event group data, and annotating the N fault event group data in a boundary position-element form to obtain annotated first data;
step S3, constructing a pre-training language model, and training the existing pre-training language model to obtain a trained language model; then, using the static word vectors of the single characters and the multiple characters in the first data as the coding features of input data of a pre-training language model to obtain pre-trained second data;
and step S4, constructing a graph neural network-based event extraction model, taking the second data as input data of the graph neural network-based event extraction model, and extracting and obtaining relationship data between fault event groups.
The end-to-end substation multivariate event relation extraction method provided by the invention achieves the following technical effects. Raw text data is preprocessed into corpus data with an event-description structure. The corpus data is divided into complete event chunks to obtain N fault event groups, N being a positive integer greater than 1; the N fault event groups are annotated in boundary-position-element form to obtain annotated first data. A pre-training language model is constructed and the existing pre-trained model is trained to obtain a trained language model; the static word vectors of single and multiple characters in the first data are then used as the encoding features of the model's input data to obtain pre-trained second data. Together with the labeling scheme for multivariate event relations in the substation corpus, the method unifies event extraction with the identification of causal, conditional, and parallel relations, improves the efficiency of structured extraction from large volumes of specialized substation fault-handling text, and fully mines the multivariate relations between events.
To better improve the efficiency of structured extraction from a large body of specialized substation fault-handling text, in a more preferable case of the invention, as shown in FIG. 2, the substation fault-handling corpus contains multiple event descriptions together with statements of handling experience and standards, such as: "When local overheating of the iron core is severe, the oil temperature rises, the oil flash point falls, the no-load loss increases, the insulation degrades, and combustible gas is released, causing the gas relay to act."
The event relation extraction problem is converted into an event sequence labeling problem: the label of the current meta-event is its relation to the next adjacent event. A single piece of data may contain different event chunks; in that case the last meta-event of each event chunk is labeled with one of the following: the previous event chunk is the cause of the next event chunk (cau-causality-end); the event chunks are juxtaposed (jux-juxtaposition-end); the previous event chunk is a condition of the next (the corresponding condition-end tag); no relation (event); and so on. These tags capture the relations between non-adjacent meta-events.
To better extract the Chinese characters in a text segment and facilitate computer processing, so that the features become ordered vectors and the efficiency of structured extraction from the substation fault-handling corpus improves, in the preferred case of the invention the graph-neural-network-based event extraction model comprises a BERT model: character-level input is fed to the model, and the lexical features of the vocabulary words that each Chinese character of a sub-event may match are integrated into the vector representation of the sub-event.
In order to better structure the feature data in the text segment, better structure the marked elements in the event, and improve the efficiency of the structured extraction process of a large number of professional fault processing corpora of the substation, in a preferred case of the present invention, in step S2, the marked form of the elements in the event includes element position information { B (element start), M (element inside), E (element end), S (single element) }, and all other parts in the event are marked as O.
For example, for the lexical features mentioned above, word set B consists of all lexicon-matched words beginning at the current character. Similarly, M consists of all lexicon-matched words passing through it in the middle, and E of all lexicon-matched words ending at it. All character-level lexical features are compressed into a vector of fixed dimension: the B, M, E, and S sets are combined by weighted summation, where each weight is the (fixed) word frequency plus a constant. First, each input sentence is scanned against the lexicon to obtain the four "BMES" word sets for each character in the sentence. Second, the frequency of each word is looked up in the statistics of the data set. A vector representation of the word sets for each character is then obtained and added to the character representation. Finally, the augmented character representation is input into BiLSTM + Attention for sequence modeling, label learning is performed with the CRF, and the result is output.
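The lexicon scan that builds the per-character B/M/E/S word sets can be sketched as a simple substring match against the lexicon (the toy lexicon below is hypothetical):

```python
def char_word_sets(sentence, lexicon):
    """For each character position, collect lexicon words that Begin at,
    pass through the Middle of, End at, or Singly occupy that position."""
    sets = [{"B": [], "M": [], "E": [], "S": []} for _ in sentence]
    n = len(sentence)
    for i in range(n):
        for j in range(i + 1, n + 1):
            w = sentence[i:j]
            if w not in lexicon:
                continue
            if len(w) == 1:
                sets[i]["S"].append(w)       # single-character match
            else:
                sets[i]["B"].append(w)       # word begins here
                for k in range(i + 1, j - 1):
                    sets[k]["M"].append(w)   # word passes through here
                sets[j - 1]["E"].append(w)   # word ends here
    return sets

lex = {"油温", "油", "温度"}
s = char_word_sets("油温度", lex)
print(s[0]["B"], s[0]["S"], s[1]["B"], s[2]["E"])
```

In practice each matched word would be replaced by its embedding and the sets pooled with the frequency weighting described above.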
In order to better refine the relationship between the current meta-event and the next adjacent event, so as to improve the efficiency of the structured extraction process of a large amount of professional fault processing corpora of the substation, and to be able to completely mine the multivariate relationship between the events, in a preferred case of the present invention, in step S2, the label labeled by the meta-event includes the relationship between the current meta-event and the next adjacent event;
and subdividing the relation among the N fault event groups, and marking the last meta-event of each event chunk in the same original corpus data as relation data, wherein the relation data comprises the reason that the previous event chunk is the next event chunk, the event chunk is parallel, the event chunk condition and no relation.
In a preferred aspect of the present invention, after the event-based annotation, the characteristics of the event are represented as:
$$e = e_c \oplus e_l$$

where $e_l$ is the concatenation of the lexical features of all characters and $e_c$ is the character feature; after splicing they form the complete sub-event vector representation, and $V_l$ is:

$$V_l(S) = \frac{1}{Z} \sum_{w \in S} z(w)\, e^w(w), \qquad Z = \sum_{w \in L} z(w)$$

wherein $L$ is the vocabulary set and $z(w)$ represents the word frequency.
To better annotate the relation between the current meta-event and the next adjacent event, in a more preferred case of the invention a set of tags is introduced for the meta-event relations within each event chunk: jux-juxtaposition: equivalent juxtaposition; com-juxtaposition: possible juxtaposition; mut-juxtaposition: mutually exclusive juxtaposition; if-condition: hypothetical condition; reverse-if-condition: hypothetical condition with reversed word order; suffi-condition: sufficient condition; nec-condition: necessary condition; cau-causality: cause and effect; reverse-causality: cause and effect with reversed word order; pur-causality: purposive cause and effect; jux-sequential: simultaneous succession (distinguished from jux-juxtaposition in that the equal, parallel nature of the events is de-emphasized and their order of occurrence is emphasized); seq-sequential: sequential succession; reverse-sequential: succession with reversed word order; sd-comparison: same-direction comparison; re-comparison: reverse comparison; trans-comparison: turning (adversative) relationship; exp-extended: explanatory relationship; exa-extended: example relationship; gen-extended: summarizing relationship; pro-extended: progressive relationship; exc-extended: exception relationship; same-o-end: same subject; same-c-end: same condition; event: no relation.
To better use the static word vectors of single and multiple characters as model inputs and obtain a mature language model, thereby processing single-character and multi-character segments effectively and improving processing efficiency, in the preferred case of the invention a pre-training language model is constructed in step S3, using the static word vectors of single and multiple characters as the encoding features of the input data.
To improve the processing efficiency of structured extraction from a large body of specialized substation fault-handling text, in the preferred case of the invention the graph-neural-network-based event extraction model further comprises Uni-gram, Bi-gram, BiLSTM, Attention, and CRF components; the vector encoded by BERT + Uni-gram + Bi-gram is input into the BiLSTM + Attention + CRF neural network model, sequence label information is output from the neural network model, and a loss function is established based on the sequence label information.
For example, the long short-term memory network LSTM can learn long-range dependencies; compared with the RNN, the LSTM adds a forget gate to alleviate the gradient-explosion and gradient-vanishing problems during training, and the GRU speeds up computation while preserving the LSTM's effectiveness. The vector sequence $x_t$ of the second data is input into a GRU (Gated Recurrent Unit) for bidirectional encoding and decoding:
and (3) encoding:
Figure BDA0002855496230000101
wherein
Figure BDA0002855496230000102
Word vectors for the t-1 th character, θencFor training the parameters, the length of the character is
And (3) decoding:
Figure BDA0002855496230000103
Figure BDA0002855496230000104
Figure BDA0002855496230000105
the operation in both directions is as follows:
Figure BDA0002855496230000106
Figure BDA0002855496230000107
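The bidirectional encoding can be sketched in NumPy with a standard GRU cell; the patent only names the GRU, so the gate equations below follow the common formulation and all shapes are illustrative:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_cell(x, h, W, U, b):
    """One standard GRU step: update gate z, reset gate r, candidate state."""
    z = sigmoid(W[0] @ x + U[0] @ h + b[0])          # update gate
    r = sigmoid(W[1] @ x + U[1] @ h + b[1])          # reset gate
    h_tilde = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])
    return (1 - z) * h + z * h_tilde

def bi_encode(xs, params_f, params_b, d):
    """Run the sequence forward and backward, concatenating the two hidden
    states per step -- the bidirectional encoding described above."""
    hf, hb = np.zeros(d), np.zeros(d)
    fwd, bwd = [], []
    for x in xs:
        hf = gru_cell(x, hf, *params_f)
        fwd.append(hf)
    for x in reversed(xs):
        hb = gru_cell(x, hb, *params_b)
        bwd.append(hb)
    return [np.concatenate([f, b]) for f, b in zip(fwd, reversed(bwd))]

rng = np.random.default_rng(0)
d_in, d = 4, 3
make = lambda: (rng.normal(size=(3, d, d_in)),    # W for the three gates
                rng.normal(size=(3, d, d)),       # U for the three gates
                np.zeros((3, d)))                 # biases
hs = bi_encode([rng.normal(size=d_in) for _ in range(5)], make(), make(), d)
print(len(hs), hs[0].shape)
```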
the Attention mechanism can be described as a similarity mapping from a query (denoted by q) to a key-value pair (denoted by k, v), and the Self-Attention mechanism is that k is v, and the calculation process is as follows: the first step is to apply the original input vector xtLinear mapping is carried out to q, k and v, in the second step, similarity calculation is carried out on q and k to obtain weights, then the weights are divided by the square root of a hidden layer dimension, common attention correlation calculation functions comprise a dot product, a perceptron and the like, in the third step, softmax function is used for normalizing the weights, then the weights and v are subjected to weighted summation to obtain a final attention matrix,
Figure BDA0002855496230000108
wherein xt∈RnxdIs linearly mapped to be q, k, v epsilon to Rnxd,dkIs the dimension of the hidden layer.
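The scaled dot-product attention described above is straightforward in NumPy (the linear projections are omitted here; the input plays the role of q, k, and v, as in self-attention):

```python
import numpy as np

def scaled_dot_attention(q, k, v):
    """softmax(q k^T / sqrt(d_k)) v, the formula given in the text."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)             # softmax over keys
    return w @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # 4 positions, hidden dimension 8
out = scaled_dot_attention(x, x, x)  # self-attention: q = k = v
print(out.shape)
```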
The method uses Attention to compute a representation of the LSTM output, learning long-distance dependencies and obtaining richer hidden-layer information.

The hidden-layer output states of the LSTM at all time steps are (n is the length of the sequence):

$$H = (h_1, h_2, \ldots, h_n)$$

During decoding at time t, the LSTM uses the hidden state of the previous time step, $s_{t-1}$, as the query vector q of the attention mechanism to compute an attention matrix over all encoder hidden states, avoiding the situation where a fixed hidden-layer dimension prevents the LSTM encoder from encoding enough information:

$$a_t = \mathrm{softmax}\big(f(s_{t-1}, H)\big)\, H$$

where f is an attention scoring function.

Then $a_t$, computed from the encoder output sequence, is taken as the input of the t-th step of the decoding layer:

$$s_t = \mathrm{LSTM}\big([a_t;\, e(y_{t-1})],\, s_{t-1};\, \theta_{dec}\big)$$

where $e(y_{t-1})$ is the embedded representation of the previous tag and $\theta_{dec}$ are the parameters of the decoding layer.
To better perform structured extraction from a large body of specialized substation fault-handling text, in the preferred case of the invention, after the vector representation of an event-group sentence is learned by the layers above, a softmax computation yields the most probable label at each position. This, however, ignores the dependencies between output labels: for example, a "B-" event relation label cannot be followed by another "B-" label. A CRF is therefore used to capture the dependencies between labels and make the output more reasonable:
The CRF models the conditional probability P(Y|X), where X = (x_1, ..., x_n) is the input sequence and Y = (y_1, ..., y_n) is the corresponding tag sequence:
P(y|x) = (1/Z(x))·exp( Σ_(i,k) λ_k·t_k(y_(i-1), y_i, x, i) + Σ_(i,l) μ_l·s_l(y_i, x, i) )
where
Z(x) = Σ_y exp( Σ_(i,k) λ_k·t_k(y_(i-1), y_i, x, i) + Σ_(i,l) μ_l·s_l(y_i, x, i) )
Here t_k is a transition feature at position i with corresponding weight λ_k (each y_i has k such features); this is an advantage over the LSTM, which considers only state features, since the CRF also models transition features between tags. s_l is a state feature at position i with corresponding weight μ_l (each y_i has l such features), and Z(x) is a normalization factor.
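The effect of transition features can be illustrated with a small Viterbi decode over emission (state) and transition scores: giving the forbidden "B → B" transition a very low score keeps it out of the best path even when the emissions favour it. The tag set and scores below are toy assumptions, not the patent's trained model:

```python
import numpy as np

tags = ["O", "B-rel", "I-rel"]           # hypothetical tag set
T = len(tags)

# Transition scores (lambda_k * t_k): row = previous tag, column = next tag.
trans = np.zeros((T, T))
trans[1, 1] = -1e9                       # forbid "B-rel" -> "B-rel"

def viterbi(emissions, trans):
    """Best tag path under emission (state) plus transition features."""
    n, num_tags = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((n, num_tags), dtype=int)
    for i in range(1, n):
        total = score[:, None] + trans + emissions[i][None, :]
        back[i] = total.argmax(axis=0)   # best previous tag for each next tag
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for i in range(n - 1, 0, -1):
        path.append(int(back[i, path[-1]]))
    return path[::-1]

# Emission scores that naively favour "B-rel" at both positions:
em = np.array([[0.0, 2.0, 0.0],
               [0.0, 2.0, 1.0]])
path = viterbi(em, trans)
print([tags[i] for i in path])           # ['B-rel', 'I-rel']: B -> B is avoided
```

Without the transition penalty, a per-position softmax would pick "B-rel" twice; the transition score is what makes the joint decoding reject the invalid sequence.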
The uni-gram features use the gigaword_chn.all.a2b.uni.ite50.vec character vectors, which contain 11,327 character vectors; the lexicon comprises the gigaword_chn.all.a2b.bi.ite50.vec and ctb.50d.vec vectors. The gigaword_chn vector sets were trained with the Word2vec tool on a large-scale, standard, word-segmented Chinese corpus and cover 704,400 characters and words, including 5,700 single-character vectors, 291,500 two-character vectors, and 278,100 three-character vectors; the word vectors ctb.50d.vec were trained on the CTB 6.0 (Chinese Treebank 6.0) corpus. BERT uses the pytorch BERT-base-Chinese release, trained at the character level on simplified and traditional Chinese Wikipedia corpora, with 12 layers, a 768-dimensional hidden layer, 12 attention heads, and 110M parameters.
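These .vec files follow the word2vec text format: a "count dim" header line, then one "token v1 ... vd" line per entry. A sketch of a loader, demonstrated on a tiny hand-written file rather than the actual vector sets named above (the file name and tokens are toy assumptions):

```python
import io

def load_vec(path, limit=None):
    """Parse word2vec text format: header 'count dim', then one
    'token v1 ... vd' line per entry. Returns {token: [float, ...]}."""
    vecs = {}
    with io.open(path, encoding="utf-8") as f:
        count, dim = map(int, f.readline().split())
        for i, line in enumerate(f):
            if limit is not None and i >= limit:
                break
            parts = line.rstrip().split(" ")
            token, values = parts[0], [float(x) for x in parts[1:]]
            assert len(values) == dim    # every row must match the header dim
            vecs[token] = values
    return vecs

# Tiny hand-written example file with 2 entries of dimension 3.
sample = "2 3\n变 0.1 0.2 0.3\n电站 0.4 0.5 0.6\n"
with io.open("toy.vec", "w", encoding="utf-8") as f:
    f.write(sample)
vecs = load_vec("toy.vec")
print(len(vecs), len(vecs["变"]))  # 2 3
```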
In order to better converge the data in the neural network and improve the operational efficiency of the model so as to obtain better structured extraction, in a preferred embodiment of the invention, the graph-based neural network event extraction model is evaluated by using a test set; if the evaluation result is lower than a predetermined target, the step of constructing the event relationship model is repeated; and if the evaluation result reaches the predetermined target, the step of constructing the event relationship extraction model is terminated, yielding the graph-based neural network event extraction model.
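The evaluate-and-repeat loop can be sketched as follows; the F1 metric and the stand-in training/evaluation callables are illustrative assumptions, not the patent's actual training code:

```python
def f1_score(tp, fp, fn):
    # Standard F1 from true positives, false positives, false negatives.
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def train_until_target(train_round, evaluate, target_f1=0.8, max_rounds=10):
    """Repeat the model-construction step until the test-set score
    reaches the predetermined target (or the round budget runs out)."""
    for rnd in range(1, max_rounds + 1):
        model = train_round(rnd)
        score = evaluate(model)
        if score >= target_f1:
            break
    return model, score, rnd

# Stand-in training/evaluation whose F1 improves each round.
model, score, rounds = train_until_target(
    train_round=lambda rnd: {"round": rnd},
    evaluate=lambda m: f1_score(tp=6 + m["round"], fp=4 - min(m["round"], 3), fn=2),
)
print(rounds, round(score, 3))           # stops once the target F1 is reached
```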
In order to better visualize the process and result of multivariate event extraction and obtain the event chain map of the substation fault and defect corpora, in a preferred embodiment of the invention, the graph-based neural network event extraction model is used to obtain an event chain map from the substation fault and defect corpora, and then to extract the lower-level event elements of the meta-events in the event chain map.
The embodiment of the invention also provides a storage medium, which comprises a stored program, wherein when the program runs, the device where the storage medium is located is controlled to execute the method.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus can be implemented in other manners. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implementing, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some interfaces, devices or units, and may be an electric or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a mobile terminal, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. The method for extracting the end-to-end-based substation multivariate event relation is characterized by comprising the following steps of,
step S1, preprocessing the original text data to form the original corpus data of the event description structure;
step S2, dividing original corpus data into complete event chunks to obtain N fault event groups, wherein N is a positive integer greater than 1, performing data annotation on the N fault event group data, and annotating the N fault event group data in a boundary position-element form to obtain annotated first data;
step S3, constructing a pre-training language model, and training the existing pre-training language model to obtain a trained language model; then, using the static word vectors of the single characters and the multiple characters in the first data as the coding features of input data of a pre-training language model to obtain pre-trained second data;
and step S4, constructing a graph neural network-based event extraction model, taking the second data as input data of the graph neural network-based event extraction model, and extracting and obtaining relationship data between fault event groups.
2. The end-to-end based substation multivariate event relationship extraction method of claim 1, wherein the graph-based neural network event extraction model comprises a BERT model, the character level is used as the input of the graph-based neural network event extraction model, the words that can be matched by each Chinese character of a sub-event are used, and their features are integrated into the vector representation of the sub-event.
3. The method for end-to-end-based substation multivariate event relationship extraction according to claim 1, wherein in step S2, the element labeling form in the event comprises element position information { B (element start), M (element inside), E (element end), S (single element) }, and all other parts in the event are marked as O.
4. The method for end-to-end based substation multivariate event relationship extraction according to claim 3, wherein in step S2, the label of the last meta-event comprises the relationship between the current meta-event and the next adjacent event;
and the relationships among the N fault event groups are subdivided: the last meta-event of each event chunk within the same original corpus data is marked with relationship data, the relationship data comprising the previous event chunk being the cause of the next event chunk, parallel event chunks, conditional event chunks, and no relationship.
5. The method for extracting the end-to-end-based substation multivariate event relationship according to claim 3, wherein after the event is labeled, the characteristics of the event are represented as:
e_l = [V_l(B); V_l(M); V_l(E); V_l(S)]
e = [e_l ; e_c]
wherein e_l is the concatenation of the lexical features of all characters and e_c is the character feature; spliced together they form the complete sub-event vector representation, and V_l is:
V_l = (1/Z)·Σ_(w∈L) z(w)·e(w),  Z = Σ_(w∈L) z(w)
wherein L is the vocabulary set and z(w) represents the word frequency of w.
6. The end-to-end-based substation multivariate event relationship extraction method according to claim 1, wherein a pre-trained language model is constructed in step S3, and single-character and multi-character static word vectors are used as encoding features of input data.
7. The end-to-end-based substation multivariate event relationship extraction method according to claim 1, wherein the graph-based neural network event extraction model further comprises a Uni-gram, a Bi-gram, a BiLSTM, an Attention, and a CRF; a vector encoded by BERT + Uni-gram + Bi-gram is input into the BiLSTM + Attention + CRF neural network model, sequence tag information is output from the neural network model, and a loss function is established based on the sequence tag information.
8. The end-to-end-based substation multivariate event relationship extraction method according to claim 7, wherein the graph-based neural network event extraction model is evaluated by using a test set; if the evaluation result is lower than a predetermined target, the step of constructing the event relationship model is repeated; and if the evaluation result reaches the predetermined target, the step of constructing the event relationship extraction model is terminated, obtaining the graph-based neural network event extraction model.
9. The end-to-end-based substation multivariate event relation extraction method according to any one of claims 1-8, characterized in that a graph-based neural network event extraction model is adopted to obtain an event chain map from substation fault and defect corpora, and then a graph-based neural network event extraction model is adopted to extract lower event elements of meta-events in the event chain map.
10. A storage medium, characterized in that the storage medium comprises a stored program, wherein the program, when executed, controls an apparatus in which the storage medium is located to perform the method of any of claims 1-9.
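Claims 3 and 5 can be illustrated with a short sketch: a BMES boundary labeler and a frequency-weighted lexicon feature V_l = (1/Z)·Σ z(w)·e(w). The element name, words, frequencies, and embeddings below are toy assumptions, not data from the patent:

```python
import numpy as np

def bmes_labels(element, length):
    """Claim 3's boundary-position labels: B/M/E for multi-character
    elements, S for single characters (all other positions are 'O')."""
    if length == 1:
        return ["S-" + element]
    return ["B-" + element] + ["M-" + element] * (length - 2) + ["E-" + element]

def lexicon_vector(words, z, embed):
    """Frequency-weighted lexical feature (a sketch of claim 5's V_l):
    V_l = (1/Z) * sum_w z(w) * e(w), with Z = sum_w z(w)."""
    Z = sum(z[w] for w in words)
    return sum(z[w] * embed[w] for w in words) / Z

print(bmes_labels("event", 1))   # ['S-event']
print(bmes_labels("event", 3))   # ['B-event', 'M-event', 'E-event']

# Two toy lexicon words with 2-d embeddings and word frequencies 3 and 1.
embed = {"断路器": np.array([1.0, 0.0]), "跳闸": np.array([0.0, 1.0])}
z = {"断路器": 3.0, "跳闸": 1.0}
v = lexicon_vector(["断路器", "跳闸"], z, embed)
print(v)  # [0.75 0.25]
```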
CN202011544274.9A 2020-12-23 2020-12-23 End-to-end-based substation multi-event relation extraction method Pending CN112632978A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011544274.9A CN112632978A (en) 2020-12-23 2020-12-23 End-to-end-based substation multi-event relation extraction method


Publications (1)

Publication Number Publication Date
CN112632978A true CN112632978A (en) 2021-04-09

Family

ID=75322085

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011544274.9A Pending CN112632978A (en) 2020-12-23 2020-12-23 End-to-end-based substation multi-event relation extraction method

Country Status (1)

Country Link
CN (1) CN112632978A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108170892A (en) * 2017-11-30 2018-06-15 中国航空综合技术研究所 A kind of fault modes and effect analysis method that emulation is deduced based on accident dynamic
CN111459131A (en) * 2020-03-04 2020-07-28 辽宁工程技术大学 Method for converting causal relationship text of fault process into symbol sequence
CN111694924A (en) * 2020-06-17 2020-09-22 合肥中科类脑智能技术有限公司 Event extraction method and system
WO2020211275A1 (en) * 2019-04-18 2020-10-22 五邑大学 Pre-trained model and fine-tuning technology-based medical text relationship extraction method
CN111897908A (en) * 2020-05-12 2020-11-06 中国科学院计算技术研究所 Event extraction method and system fusing dependency information and pre-training language model


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WEI You; LIU Maofu; HU Huijun: "Biomedical event extraction based on deep contextual word representation and self-attention", Computer Engineering and Science, no. 09, 15 September 2020 (2020-09-15) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282726A (en) * 2021-05-27 2021-08-20 成都数之联科技有限公司 Data processing method, system, device, medium and data analysis method
CN113484693A (en) * 2021-07-30 2021-10-08 国网四川省电力公司电力科学研究院 Transformer substation secondary circuit fault positioning method and system based on graph neural network
CN113484693B (en) * 2021-07-30 2023-04-07 国网四川省电力公司电力科学研究院 Transformer substation secondary circuit fault positioning method and system based on graph neural network
WO2024164728A1 (en) * 2023-02-10 2024-08-15 天翼云科技有限公司 Knowledge graph event extraction method and apparatus, device, and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination