CN112541341A - Text event element extraction method - Google Patents
- Publication number
- CN112541341A CN112541341A CN202011510822.6A CN202011510822A CN112541341A CN 112541341 A CN112541341 A CN 112541341A CN 202011510822 A CN202011510822 A CN 202011510822A CN 112541341 A CN112541341 A CN 112541341A
- Authority
- CN
- China
- Prior art keywords
- event
- bert model
- sequence
- text
- trained
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Abstract
The invention discloses a text event element extraction method in the field of computer technology. The text is input into a trained first sequence-labeling BERT model to obtain a plurality of trigger words; the trigger words, together with the text in which they occur, are then input into a trained second sequence-labeling BERT model to obtain the event elements corresponding to each trigger word and to generate an event element set. The method improves both the applicability and the accuracy of event extraction.
Description
Technical Field
The invention relates to the technical field of computers, in particular to a text event element extraction method.
Background
Event element extraction is one of the basic tasks in natural language processing and an important subtask of information extraction. It aims to identify the events that occur in a piece of text and the elements of each event — for example, the trigger word together with event elements such as the event subject, the event object, the time, and the place.
Existing event element extraction schemes mainly rely on hand-crafted trigger words and machine-learning classifiers, casting the event element extraction process as a classification problem.
Disclosure of Invention
In order to overcome the defects of the prior art, an embodiment of the present invention provides a text event element extraction method comprising the following steps:
inputting the text into a trained first sequence-labeling BERT model to obtain a plurality of trigger words;
and inputting the plurality of trigger words, together with the text in which they occur, into a trained second sequence-labeling BERT model to obtain the event elements corresponding to the plurality of trigger words and generate an event element set, wherein the event elements comprise an event subject, an event object, a time, and a place.
Preferably, after generating the set of event elements, the method further comprises:
obtaining the syntactic dependency relationship between each trigger word and each event element using a Language Technology Platform (LTP) model;
and determining, according to the syntactic dependency relationship, whether each event element is correct.
Preferably, determining whether each event element is correct according to the syntactic dependency relationship includes:
when the syntactic dependency relationship shows that the element is the subject of a subject-predicate relation, manually verifying whether the corresponding event subject in the event element set is indeed the event subject in the text, and filtering out the event element if not.
Preferably, determining whether each event element is correct according to the syntactic dependency relationship further includes:
when the syntactic dependency relationship shows that the element is the object of a verb-object relation, manually verifying whether the corresponding event object in the event element set is indeed the event object in the text, and filtering out the event element if not.
Preferably, the training process of the first sequence-labeling BERT model includes:
inputting a plurality of sentence-level texts carrying trigger-word labels into a sequence-labeling BERT model as training data, and training it to obtain the trained first sequence-labeling BERT model.
Preferably, the training process of the second sequence-labeling BERT model includes:
adding a conditional random field (CRF) layer to the trained sequence-labeling BERT model to obtain the trained second sequence-labeling BERT model.
Preferably, the training process of the second sequence-labeling BERT model further includes:
inputting a plurality of sentence-level texts carrying event-element labels into a sequence-labeling BERT model as training data, and training it to obtain the trained second sequence-labeling BERT model.
The text event element extraction method provided by the embodiment of the invention has the following beneficial effects:
the method has the advantages that the trigger words are predicted by marking the BERT model through the trained first sequence, the event elements are predicted by marking the BERT model through the trained second sequence, the method is suitable for linguistic data of various sources, and the accuracy rate of extracting the event elements is high.
Detailed Description
The present invention will be described in detail with reference to the following embodiments.
The text event element extraction method provided by the embodiment of the invention comprises the following steps:
S101, inputting the text into the trained first sequence-labeling BERT model to obtain a plurality of trigger words.
The first sequence-labeling BERT model uses the encoder of the Transformer architecture. The Transformer is built on an attention mechanism and can learn the contextual relationships between the words in a text. The original Transformer comprises two separate structures: an encoder, which receives the text as input, and a decoder, which predicts the result for the task.
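The scaled dot-product attention at the heart of the Transformer encoder can be sketched in a few lines of plain Python. This is illustrative only — the function names are not from the patent, and a real BERT encoder adds learned projections, multiple heads, and layer stacking:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for a single head.

    queries/keys/values: lists of equal-length float vectors.
    Returns one output vector per query: a softmax-weighted mix of values.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
    return outputs
```

Because every query attends over every key, each token's output vector mixes in information from the whole sentence — this is how the encoder learns context.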
S102, inputting the plurality of trigger words, together with the text in which they occur, into the trained second sequence-labeling BERT model to obtain the event elements corresponding to the trigger words and generate an event element set, wherein the event elements comprise an event subject, an event object, a time, and a place.
The specific process for extracting the event elements comprises the following steps:
the embedding layer of the second sequence labeling BERT model converts an input text into three embedding characteristics of sub-word embedding, position embedding and segmentation embedding, the position of a trigger word in the sub-word embedding characteristics is replaced by 1, and the coding layer constructs a vector representation representing the semantics of each character to be classified based on the semantic vector of the sub-word output by the embedding layer. And the output layer finally inputs the vector representation corresponding to each word into a full-connection layer for multi-classification, and the class with the highest probability is taken as the classification mark of the word.
Optionally, after generating the set of event elements, the method further comprises:
obtaining the syntactic dependency relationship between each trigger word and each event element using a Language Technology Platform (LTP) model;
and determining, according to the syntactic dependency relationship, whether each event element is correct.
Optionally, determining whether each event element is correct according to the syntactic dependency relationship includes:
when the syntactic dependency relationship shows that the element is the subject of a subject-predicate relation, manually verifying whether the corresponding event subject among the event elements is indeed the event subject in the text, and filtering out the event element if not.
Optionally, determining whether each event element is correct according to the syntactic dependency relationship further includes:
when the syntactic dependency relationship shows that the element is the object of a verb-object relation, manually verifying whether the corresponding event object among the event elements is indeed the event object in the text, and filtering out the event element if not.
As a specific embodiment, for the text "the Northern Army massed 100 aircraft in Iraq", the first sequence-labeling BERT model predicts "massed" as the trigger word, and the second sequence-labeling BERT model predicts "the Northern Army" as the event subject and "100 aircraft" as the event object. The LTP model then determines that "the Northern Army" stands in a subject-predicate relation to "massed" and that "aircraft" stands in a verb-object relation to it, confirming "the Northern Army" as the event subject and "100 aircraft" as the event object.
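The dependency-based filtering step above can be sketched as follows. This is plain Python and not from the patent; the role names, the relation labels `SBV` (subject-predicate) and `VOB` (verb-object), and the function name are illustrative assumptions based on common LTP conventions:

```python
def filter_event_elements(elements, dependencies):
    """Keep only event elements whose syntactic relation to the trigger
    matches the expected role: subjects must stand in a subject-predicate
    (SBV) relation, objects in a verb-object (VOB) relation.

    elements: dict mapping role -> word, e.g. {"subject": ..., "object": ...}
    dependencies: dict mapping word -> relation label from the parser.
    """
    expected = {"subject": "SBV", "object": "VOB"}
    kept = {}
    for role, word in elements.items():
        required = expected.get(role)
        # Roles without a dependency constraint (e.g. time, place) pass through.
        if required is None or dependencies.get(word) == required:
            kept[role] = word
    return kept
```

An element whose parsed relation contradicts its predicted role (e.g. a supposed object that the parser sees as an attribute) is dropped, which is the manual-verification filter the description calls for.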
Optionally, the training process of the first sequence-labeling BERT model includes:
inputting a plurality of sentence-level texts carrying trigger-word labels into the sequence-labeling BERT model as training data, and training it to obtain the trained first sequence-labeling BERT model.
Optionally, the training process of the second sequence labeling BERT model includes:
and adding a CRF layer of the conditional random field CRF model to the trained sequence label BERT model to obtain a trained second sequence label BERT model.
Wherein the CRF layer of the conditional random field CRF model is used to learn the relationships between different labels, rather than making independent predictions.
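The effect of the CRF layer can be illustrated with a minimal Viterbi decoder: it picks the label sequence that maximizes emission scores plus learned label-transition scores, rather than each token's label in isolation. This plain-Python sketch is not the patent's implementation (in practice a library CRF layer, e.g. the third-party `pytorch-crf` package, would be used):

```python
def viterbi_decode(emissions, transitions, labels):
    """Best label sequence under per-token emission scores plus
    label-to-label transition scores, as a CRF layer would compute it.

    emissions: one dict (label -> score) per token, from the encoder.
    transitions: dict ((prev_label, label) -> score); missing pairs score 0.
    """
    # Initialise with the first token's emission scores.
    best = {lab: (emissions[0][lab], [lab]) for lab in labels}
    for emit in emissions[1:]:
        nxt = {}
        for lab in labels:
            # Choose the best previous label leading into this label.
            prev = max(labels,
                       key=lambda p: best[p][0] + transitions.get((p, lab), 0.0))
            score = best[prev][0] + transitions.get((prev, lab), 0.0) + emit[lab]
            nxt[lab] = (score, best[prev][1] + [lab])
        best = nxt
    return max(best.values(), key=lambda sp: sp[0])[1]
```

A negative transition score for an implausible label pair (e.g. `O` directly followed by an inside tag) steers the decoder away from sequences that per-token argmax would happily produce.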
Optionally, the training process of the second sequence-labeling BERT model further includes:
inputting a plurality of sentence-level texts carrying event-element labels into the sequence-labeling BERT model as training data, and training it to obtain the trained second sequence-labeling BERT model.
As a specific embodiment, the labeled data is converted into sequence-labeling format using IOB tags: the tag I marks a character inside a text block, the tag O marks a character outside any text block, and the tag B marks the first character of a text block that immediately follows another block of the same type.
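Decoding an IOB tag sequence back into labelled spans can be sketched as follows (illustrative only; the function name and tag set are not from the patent):

```python
def iob_to_spans(tokens, tags):
    """Group character-level IOB tags back into labelled text spans.

    B-<type> opens a new span, I-<type> continues the current one,
    and O closes it. Returns a list of (text, type) pairs.
    """
    spans, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:  # close the previous span before opening a new one
                spans.append(("".join(current), label))
            current, label = [token], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(token)
        else:  # "O", or a stray I- tag with no open span
            if current:
                spans.append(("".join(current), label))
            current, label = [], None
    if current:
        spans.append(("".join(current), label))
    return spans
```

This is the inverse of the labeling step: the model emits one tag per character, and the decoder reassembles contiguous B/I runs into trigger-word and event-element spans.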
According to the text event element extraction method provided by the embodiment of the invention, the text is input into the trained first sequence-labeling BERT model to obtain a plurality of trigger words, and the trigger words together with the text in which they occur are input into the trained second sequence-labeling BERT model to obtain the corresponding event elements and generate an event element set, thereby improving both the applicability and the accuracy of event element extraction.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It will be appreciated that the related features of the method and apparatus described above may refer to one another.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In addition, the memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.
Claims (7)
1. A text event element extraction method is characterized by comprising the following steps:
inputting the text into a trained first sequence-labeling BERT model to obtain a plurality of trigger words;
and inputting the plurality of trigger words, together with the text in which they occur, into a trained second sequence-labeling BERT model to obtain the event elements corresponding to the plurality of trigger words and generate an event element set, wherein the event elements comprise an event subject, an event object, a time, and a place.
2. The textual event element extraction method of claim 1, wherein after generating the set of event elements, the method further comprises:
obtaining the syntactic dependency relationship between each trigger word and each event element using a Language Technology Platform (LTP) model;
and determining, according to the syntactic dependency relationship, whether each event element is correct.
3. The method of claim 2, wherein determining whether each event element is correct according to the syntactic dependency relationship comprises:
when the syntactic dependency relationship shows that the element is the subject of a subject-predicate relation, manually verifying whether the corresponding event subject among the event elements is indeed the event subject in the text, and filtering out the event element if not.
4. The text event element extraction method of claim 2, wherein determining whether each event element is correct according to the syntactic dependency relationship further comprises:
when the syntactic dependency relationship shows that the element is the object of a verb-object relation, manually verifying whether the corresponding event object among the event elements is indeed the event object in the text, and filtering out the event element if not.
5. The text event element extraction method of claim 1, wherein the training process of the first sequence-labeling BERT model comprises:
inputting a plurality of sentence-level texts carrying trigger-word labels into a sequence-labeling BERT model as training data, and training it to obtain the trained first sequence-labeling BERT model.
6. The text event element extraction method of claim 1, wherein the training process of the second sequence-labeling BERT model comprises:
adding a conditional random field (CRF) layer to the trained sequence-labeling BERT model to obtain the trained second sequence-labeling BERT model.
7. The text event element extraction method of claim 6, wherein the training process of the second sequence-labeling BERT model further comprises:
inputting a plurality of sentence-level texts carrying event-element labels into a sequence-labeling BERT model as training data, and training it to obtain the trained second sequence-labeling BERT model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011510822.6A CN112541341A (en) | 2020-12-18 | 2020-12-18 | Text event element extraction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112541341A true CN112541341A (en) | 2021-03-23 |
Family
ID=75019132
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011510822.6A Pending CN112541341A (en) | 2020-12-18 | 2020-12-18 | Text event element extraction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112541341A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113254628A (en) * | 2021-05-18 | 2021-08-13 | 北京中科智加科技有限公司 | Event relation determining method and device |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104679850A (en) * | 2015-02-13 | 2015-06-03 | 深圳市华傲数据技术有限公司 | Address structuring method and device |
CN107122416A (en) * | 2017-03-31 | 2017-09-01 | 北京大学 | A kind of Chinese event abstracting method |
CN111222317A (en) * | 2019-10-16 | 2020-06-02 | 平安科技(深圳)有限公司 | Sequence labeling method, system and computer equipment |
CN111651986A (en) * | 2020-04-28 | 2020-09-11 | 银江股份有限公司 | Event keyword extraction method, device, equipment and medium |
CN111881299A (en) * | 2020-08-07 | 2020-11-03 | 哈尔滨商业大学 | Outlier event detection and identification method based on duplicate neural network |
CN112084381A (en) * | 2020-09-11 | 2020-12-15 | 广东电网有限责任公司 | Event extraction method, system, storage medium and equipment |
CN112084746A (en) * | 2020-09-11 | 2020-12-15 | 广东电网有限责任公司 | Entity identification method, system, storage medium and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||