CN115544212A - Document-level event element extraction method, apparatus and medium - Google Patents

Document-level event element extraction method, apparatus and medium

Info

Publication number
CN115544212A
Authority
CN
China
Prior art keywords
sentence
event
central
chapter
vectors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211344142.0A
Other languages
Chinese (zh)
Inventor
廖泓舟
代翔
戴礼灿
潘磊
张武
彭晓
胡艳霞
Current Assignee
CETC 10 Research Institute
Original Assignee
CETC 10 Research Institute
Priority date
Filing date
Publication date
Application filed by CETC 10 Research Institute filed Critical CETC 10 Research Institute
Priority to CN202211344142.0A
Publication of CN115544212A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/35 Clustering; Classification
    • G06F 40/00 Handling natural language data
    • G06F 40/10 Text processing
    • G06F 40/12 Use of codes for handling textual entities
    • G06F 40/126 Character encoding
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a document-level event element extraction method, apparatus, and medium, belonging to the field of document-level event element extraction and comprising the following steps: obtain sentence vectors and splice them into an initial chapter vector; meanwhile, feed the sentence vectors into an attention mechanism network to obtain a chapter vector containing the implicit relations between sentences, and mix the two vectors to obtain the final text chapter representation; perform event sentence identification and element extraction, central sentence identification, and event element and cross-sentence event relation extraction; apply mathematical statistics to the correlations between central sentences and events and complete central sentence screening based on a competitive principle; and complete the elements of the central event based on an element completion model to obtain complete document-level event elements. The invention has the advantages of completeness and practicability.

Description

Document-level event element extraction method, device and medium
Technical Field
The present invention relates to the field of document-level event element extraction, and more particularly, to a method, an apparatus, and a medium for extracting document-level event elements.
Background
With the vigorous development of internet technology, a large amount of data, such as current news, company financial reports, and electronic medical records, is generated in cyberspace every day, greatly enriching the ways in which people acquire information. Network information takes various carrier forms, such as text, pictures, audio, and video, but text accounts for the highest proportion. By reading text, people can learn the time, place, objects, actions, and other elements of an event, thereby understanding the leading-edge issues in current social and economic development and expanding their knowledge. However, while the explosive growth of information brings convenience, it also makes fully comprehending this vast amount of knowledge a challenge. How to efficiently screen out the event knowledge of interest from massive unstructured text has become an urgent problem of the information explosion era.
In recent years, with the rapid progress of natural language processing technologies, event extraction has become one of the research hotspots in the field of information processing. The core task of event extraction is to use a computer to extract the element information of an event from unstructured natural language text and express it in a semi-structured or structured form. On the one hand, event extraction can automatically filter out the event information a user cares about, greatly improving the efficiency with which people obtain useful information. On the other hand, event element information stored in structured form is easier for a computer to understand and process, laying the data foundation for natural language processing applications; using structured event information, for example by constructing knowledge graphs, a computer can provide high-level services such as machine question answering, associative information retrieval, event reasoning and analysis, and intelligent authoring. Governments have long needed to learn of the outbreak and evolution of trending social events in time and respond accordingly, so event detection and monitoring have been a central focus of public affairs management. Businesses in the commercial and financial fields need to quickly discover market reactions to products to derive signals for risk analysis and trading advice, which can also rely on event extraction techniques. In the biomedical field, event extraction can be used to identify changes in the state of, or interactions between, biomolecules in order to understand disease and physiological mechanisms. In short, event extraction technology can benefit many fields and has strong practical significance and good application prospects.
With the continuous expansion of event extraction applications at home and abroad and the continuous progress of science and technology, research interest in event extraction has grown year by year. The research falls mainly into two strands: event element extraction based on pattern matching and event element extraction based on deep learning. Pattern matching methods need to mine the association characteristics of contextual arguments under different event types and formulate corresponding pattern matching templates, then realize event extraction through pattern matching. The construction of event templates has developed through three stages, from manual construction, to learning templates from manually pre-classified corpora, to automatically learning event templates from knowledge bases, with the aim of continually reducing the manual workload. For example, Riloff et al. observed that the most important facts of an event often appear in its first description and that the adjacent text usually contains descriptions of the event roles; they developed the AutoSlog event extraction system using an automatically constructed trigger-word dictionary and achieved good results on the MUC-4 corpus. Surdeanu et al. proposed a domain-independent event extraction approach based on automatically recognized predicate-argument structures. In recent years, the rapid development of deep learning has in turn advanced event extraction technology, mainly along several research directions such as CNNs, RNNs, GNNs, hybrid neural networks, and attention mechanisms.
For example, Nguyen and Grishman were the first to use CNNs for event detection: they predict event trigger words and event types in sentences with a CNN, where each character in a sentence is first converted into a real-valued representation vector formed by splicing word, position, and entity-type vectors, and the CNN then outputs an event-type classification for each character. Nguyen et al. later proposed a bidirectional RNN structure for joint event extraction, comprising an encoding stage and a prediction stage: the encoding stage uses binary vectors instead of position features to represent dependency features so as to jointly predict event trigger words and arguments, while the prediction stage divides the dependencies between trigger words and arguments into dependencies between trigger-word subtypes, between argument roles, and between trigger-word subtypes and argument roles. Rao et al. applied Abstract Meaning Representation (AMR), a semantic analysis technique, treating the event structure as a subgraph of an AMR graph, defining the event extraction task as AMR subgraph recognition, and training a graph LSTM model to recognize event subgraphs. Since these event extraction methods have respective strengths and weaknesses in capturing text features, relations, and dependencies, much research combines different neural networks to learn different types of features and exploit their complementary advantages: T. H. Nguyen first obtains initial word feature vectors with a Bi-LSTM model and then applies a graph convolutional network to extract events, and other researchers have proposed combining CNNs and RNNs for event extraction. Finally, attention mechanisms from the computer vision field are increasingly applied across fields, including natural language processing tasks: Wu et al. used argument information to train Bi-LSTM networks with attention mechanisms, and Wei et al. identified event elements with a self-attention-based multi-classification model, attending to the more important parts of a sentence and capturing its different features, thereby strengthening event element identification.
In summary, progress in deep learning theory has driven great progress in event extraction research, but existing research is still disconnected from real application scenarios in many respects. Constrained by the difficulty of building event extraction datasets and of the task itself, most existing event extraction techniques assume that all element information of an event appears within a single sentence, limiting the extraction granularity to the sentence level. In reality, however, owing to the flexibility of natural language expression, the element information of an event is often distributed across multiple sentences of a document, and multiple sentences may share the same element. Event extraction at sentence granularity is clearly insufficient for real event extraction requirements.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a document-level event element extraction method, apparatus, and medium, which have the advantages of completeness and practicability.
The purpose of the invention is realized by the following scheme:
a document-level event element extraction method comprises the following steps:
s1, sentence segmentation is carried out on a text of a chapter to obtain sentence fragments, character coding, position coding and type coding are carried out on the sentence fragments to obtain coding vectors C, L and S respectively; the code vectors C, L and S are overlapped to obtain sentence-level initialization vectors, and the sentence-level initialization vectors are sent to a pre-training model to obtain a sentence vector V i The pre-training model comprises a Bert pre-training model; vector each sentence into V i Splicing to obtain an initial chapter vector V initial (ii) a Meanwhile, the sentence-level initialization vector is sent to an attention mechanism network, different sentences are endowed with different weight values, and a chapter vector V containing implicit relations between the sentences is obtained relation (ii) a Will V initial And V relation Performing superposition mixing characterization to obtain a final text discourse representation V text
S2, based on the text chapter representation V_text of step S1, sentence vectors are classified by sequence labeling to complete event sentence identification and element extraction, and chapter vectors are classified to complete central sentence identification and element extraction, while event correlations are obtained;
S3, mathematical statistics are applied to the correlations between the central sentences and the events, and central sentence screening is completed based on a competitive principle;
S4, central event element completion is performed based on the element completion model to obtain complete document-level event elements.
Further, in step S1, performing character encoding, position encoding, and type encoding on the sentence fragments specifically comprises: performing character encoding on the characters in each sentence fragment based on a character table, performing position encoding on the position differences of the characters, and assigning a type code to the current sentence.
Further, in step S2, classifying sentence vectors by sequence labeling to complete event sentence identification and element extraction specifically comprises the steps of: event sentence labels are represented by 0 and 1, where 0 denotes a non-event sentence and 1 denotes an event sentence; event element labels use the B-I-O scheme, where B denotes the start position of an element, I denotes the other positions of an element, and O denotes a non-element position, with the subscripts sub, obj, tim, loc, and tri denoting subject, object, time, place, and trigger word respectively; event sentence identification is realized by applying a fully connected linear transformation to the [CLS] vector of a sentence vector and then detecting whether the sentence is an event sentence through softmax binary classification; element extraction is realized by applying a fully connected linear transformation to each character position vector in the sentence vector and detecting its tag through softmax classification, finally obtaining whether the current sentence is an event sentence and the specific event element information.
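A minimal sketch of the two softmax heads described above, with untrained placeholder weights standing in for the fully connected layers (all function names are illustrative, not taken from the patent):

```python
import numpy as np

# BIO tag set with the five argument-role subscripts named in the patent
BIO_TAGS = ["O"] + [f"{p}-{r}" for r in ("sub", "obj", "tim", "loc", "tri")
                    for p in ("B", "I")]

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def classify_event_sentence(cls_vec, W, b):
    """Fully connected linear transform on the [CLS] vector + softmax binary
    classification: returns 1 for an event sentence, 0 otherwise."""
    return int(softmax(cls_vec @ W + b).argmax())

def tag_elements(token_vecs, W, b):
    """Fully connected linear transform on each character-position vector +
    softmax over the BIO tag set, yielding one tag per character."""
    return [BIO_TAGS[i] for i in softmax(token_vecs @ W + b).argmax(axis=-1)]
```

In the patented method the weights would come from training on labeled event sentences; here they are random placeholders, so only the shapes of the heads are meaningful.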
Further, in step S2, classifying the chapter vectors to complete central sentence identification and element extraction while obtaining event correlations specifically comprises the steps of: the central sentences are labeled with 0 and 1, and the event relations are labeled with 0, 1, 2, and 3, where 0 denotes no relation, 1 denotes a sequential relation, 2 denotes an associative relation, and 3 denotes a causal relation; central sentence identification applies a fully connected linear transformation to each [CLS] vector in the chapter vectors and detects whether it is a central sentence through softmax binary classification;
inter-event relation extraction applies a fully connected linear transformation to pairs of [CLS] vectors in the chapter vectors and detects the relation class through softmax four-way classification.
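The pairwise four-way relation head can be sketched as follows; concatenating the two [CLS] vectors before the linear layer is an assumption about how the pairing is realized, and the weights are untrained placeholders:

```python
import numpy as np

RELATION_LABELS = {0: "none", 1: "sequential", 2: "associative", 3: "causal"}

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def extract_event_relations(cls_vecs, W, b):
    """Four-way relation classification over every pair of sentence [CLS]
    vectors: each pair is concatenated, linearly transformed, and
    softmax-classified into one of the four relation labels."""
    relations = {}
    for i in range(len(cls_vecs)):
        for j in range(i + 1, len(cls_vecs)):
            pair = np.concatenate([cls_vecs[i], cls_vecs[j]])
            relations[(i, j)] = RELATION_LABELS[int(softmax(pair @ W + b).argmax())]
    return relations
```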
Further, step S3 comprises the sub-steps of:
traversing each central sentence and counting, by mathematical statistics, the number of event sentences that have an implicit relation with it; by the competitive principle (the more implicit relations a sentence has, the more prominent its central idea), the central sentence with the most implicit relations is selected as the unique central sentence of the text; if the numbers of implicit relations are tied, the candidates are ranked by default according to the priority of causal relations, associative relations, and sequential relations; if no central sentence has an implicit relation or the tie still persists, the foremost central sentence is selected as the unique central sentence by default.
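One consistent reading of the screening rules above can be sketched as follows (count of implicit relations first, then relation-type priority, then position in the text; the exact ordering of the fallback rules is an interpretation of the patent's wording):

```python
RELATION_PRIORITY = {"causal": 3, "associative": 2, "sequential": 1}

def select_central_sentence(candidates, relations):
    """Competitive screening: the candidate central sentence linked to the
    most event sentences wins; ties fall back to the highest-priority
    relation type (causal > associative > sequential), then to the earliest
    position in the text.

    candidates: list of sentence indices flagged as central sentences
    relations:  dict {(i, j): relation_name} over sentence-index pairs
    """
    def score(c):
        rels = [r for (i, j), r in relations.items() if c in (i, j)]
        best = max((RELATION_PRIORITY[r] for r in rels), default=0)
        return (len(rels), best, -c)  # more relations, higher priority, earlier
    return max(candidates, key=score)
```

For example, a candidate with two implicit relations beats one with a single relation, and with equal counts a causal relation outranks a sequential one.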
Further, in step S4, the completing the element completion of the central event based on the element completion model specifically includes the following substeps:
s41, defining an event element universal template, wherein the event element universal template comprises five universal argument roles of a subject, an object, time, a place and a trigger word; constructing an event example sample, carrying out element reconfiguration on the finally confirmed central sentence elements according to the universal template sample, and simultaneously carrying out missing position supplement by using the elements of the central sentence correlated event sentences to form positive and negative samples;
and S42, splicing the positive and negative samples with the original text respectively to form new samples, and then carrying out chapter-level event classification.
Further, in step S42, the chapter-level event classification specifically comprises the sub-steps of:
s421, performing text chapter representation to obtain chapter-level characteristics;
s422, probability of 0 and 1 is obtained through softmax two-classification, wherein 1 represents that the sample is reasonable, the element supplement is correct, 0 represents that the sample is unreasonable, and the element supplement is wrong, wherein the chapter-level event classification is obtained by pre-training positive and negative samples constructed in advance on the basis of text chapter representation.
Further, the text chapter representation V_text is the whole-text chapter vector representation containing both the semantics of each sentence itself and the semantics of the sentence context.
A computer device comprising a processor and a memory, the memory having stored therein a computer program which, when loaded by the processor and executed, carries out the method of any of the preceding claims.
A readable storage medium, in which a computer program is stored, which computer program is loaded by a processor and executes a method according to any of the above.
The beneficial effects of the invention include:
(1) The technical scheme of the invention has the advantage of completeness: the semantic information of sentences, the context information of sentences, the semantic information between sentences, and the semantic information of the whole chapter are considered comprehensively, so the analysis dimensions are richer, the semantic information is fuller, the semantic analysis is more complete, and subsequent application analysis is better supported;
(2) The technical scheme of the invention has the advantage of practicability: it addresses practical problems such as the sparse distribution and sharing of chapter-level event elements, considers chapter-level semantic information comprehensively from actual extraction requirements, takes the sentence-level extraction of the prior art as its basis, and treats chapter-level extraction as the key difficulty, so the technique has high practicability.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flowchart illustrating a document level event element extraction principle according to an embodiment of the present invention;
FIG. 2 is a schematic representation of a text chapter in accordance with an embodiment of the present invention;
FIG. 3a is a schematic diagram illustrating the principle of event recognition and element extraction according to an embodiment of the present invention;
FIG. 3b is a schematic diagram illustrating a recognition principle of a central sentence and an event relation according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating a central sentence screening principle according to an embodiment of the present invention;
fig. 5 is a schematic diagram illustrating a principle of completing elements according to an embodiment of the present invention.
Detailed Description
All features disclosed in all embodiments in this specification, or all methods or process steps implicitly disclosed, may be combined and/or expanded, or substituted, in any way, except for mutually exclusive features and/or steps.
In view of the technical problems described in the background, the technical scheme of the invention provides a document-level event element extraction method that extends the extraction range of event arguments to the document level on the basis of sentence-level event extraction, further improving the applicability and practicability of the method.
As shown in FIG. 1, according to the technical solution of the present invention, the text of a chapter is first segmented into sentences to obtain sentence fragments; character encoding, position encoding, and type encoding are performed on the sentence fragments to obtain encoding vectors C, L, and S respectively; the three encoding values are superposed and fed into a pre-trained BERT model to obtain sentence vectors V_i, which are spliced to obtain an initial chapter vector V_initial; meanwhile, the sentence vectors are fed into an attention mechanism network, different sentences are given different weights, and a chapter vector V_relation containing the implicit relations between sentences is obtained; the two vectors are mixed to obtain the final text chapter representation V_text. Then event sentence identification and element extraction, central sentence identification, and event element and cross-sentence event relation extraction are performed: sentence vectors are classified by sequence labeling to complete event sentence identification and element extraction, and chapter vectors are classified to complete central sentence identification and element extraction, while event correlations are obtained. Next, mathematical statistics are applied to the correlations between the central sentences and the events, and central sentence screening is completed based on a competitive principle. Finally, central event element completion is performed based on the element completion model to obtain complete document-level event elements.
As shown in fig. 2, a text chapter is selected and first segmented into sentence fragments. The characters in each sentence fragment are character-encoded based on a character table, the position differences of the characters are position-encoded, and a type code is assigned to the current sentence; the three encodings are superposed to obtain a sentence-level initialization vector. The initialization vector of each sentence is input into a BERT pre-training model to obtain the vector encoding of each sentence, and the vectors of all sentences are then spliced (Concat) to obtain the initial chapter-level vector encoding of the current text. Meanwhile, the initialization vector of each sentence is input into a multilayer attention network (3 layers by default, expandable or reducible according to the actual situation, with a minimum of 1 layer) to obtain the weight relations between different sentence vectors and form a new chapter-level initialization vector. The sentence-level and chapter-level initialization vectors are superposed, finally yielding the whole-text chapter vector representation containing both the sentences' own semantics and the sentences' context semantics.
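The encoding and mixing pipeline above can be sketched as follows, a toy illustration in which random stand-in embeddings with mean pooling replace the BERT encoder and a single softmax-weighted mix stands in for the multilayer attention network; all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_sentence(char_ids, dim=8):
    """Superpose character (C), position (L), and type (S) encodings, then
    pool into one sentence vector V_i. Random embeddings and mean pooling
    stand in for the BERT pre-training model."""
    n = len(char_ids)
    C = rng.standard_normal((n, dim))   # character encoding (stand-in)
    L = rng.standard_normal((n, dim))   # position encoding (stand-in)
    S = np.ones((n, dim))               # type encoding for the current sentence
    return (C + L + S).mean(axis=0)

def chapter_representation(sentence_vecs):
    """Superpose V_initial (spliced sentence vectors) with V_relation
    (attention-weighted vectors carrying implicit inter-sentence relations)."""
    V_initial = np.stack(sentence_vecs)             # splice: (num_sentences, dim)
    scores = V_initial @ V_initial.T                # toy self-attention scores
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    V_relation = weights @ V_initial                # weighted inter-sentence mix
    return V_initial + V_relation                   # mixed representation V_text
```

In the patented method the encoder is a trained BERT model and the attention network has a configurable depth (three layers by default); the single softmax-weighted mix above only illustrates the superposition of V_initial and V_relation.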
As shown in fig. 3a and fig. 3b, event sentence identification, central sentence identification, event element extraction, and inter-event relation extraction are performed on the basis of the text chapter vector representation of fig. 2. Event sentence identification and event element extraction are completed by sequence labeling over the basic BERT sentence vector features: event sentence labels are represented by 0 and 1 (0 for a non-event sentence, 1 for an event sentence); event element labels use the B-I-O scheme (B for the element start position, I for the other element positions, O for non-element positions), with the subscripts sub, obj, tim, loc, and tri denoting subject, object, time, place, and trigger word respectively. Event sentence identification applies a fully connected linear transformation to the [CLS] vector of a sentence vector and detects whether it is an event sentence through softmax binary classification; element extraction applies a fully connected linear transformation to each character position vector in the sentence vector and detects its tag through softmax classification, finally obtaining whether the current sentence is an event sentence and the specific event element information. Central sentence identification and inter-event relation extraction are completed by sequence labeling over the higher-level chapter text features: central sentences are still labeled with 0 and 1, and event relations are labeled with 0, 1, 2, and 3 (3 typical relations by default, expandable or reducible according to the actual situation, with a minimum of 1 relation), where 0 denotes no relation, 1 a sequential relation, 2 an associative relation, and 3 a causal relation.
Central sentence identification applies a fully connected linear transformation to each [CLS] vector in the chapter vectors and detects whether it is a central sentence through softmax binary classification; inter-event relation extraction applies a fully connected linear transformation to pairs of [CLS] vectors in the chapter vectors and detects the relation class through softmax four-way classification.
As shown in fig. 4, each central sentence is traversed and the number of event sentences having an implicit relation with it is counted by mathematical statistics. By the competitive principle (the more implicit relations a sentence has, the more prominent its central idea), the central sentence with the most implicit relations is selected as the unique central sentence of the text; if the numbers of implicit relations are tied, the candidates are ranked by default according to the priority of causal, associative, and sequential relations; if no central sentence has an implicit relation or the tie still persists, the foremost central sentence is selected as the unique central sentence by default.
As shown in fig. 5, an event element general template is first defined, comprising five general argument roles (subject, object, time, place, and trigger word) and not limited to any particular event type; an event instance sample of the form "(subject) at (place) (trigger word) (object)" is constructed. Combining the event relations of fig. 3 and the central sentence screening of fig. 4, the finally confirmed central sentence elements are reconfigured according to the general template, and the missing positions are supplemented with elements from event sentences related to the central sentence, forming positive and negative samples. The positive and negative samples are then each spliced with the original text to form new samples, and chapter-level event classification is performed: text chapter representation is first performed to obtain chapter-level features, and the probabilities of 0 and 1 are then obtained through softmax binary classification, where 1 indicates the sample is reasonable and the element completion is correct, and 0 indicates the sample is unreasonable and the element completion is wrong. The chapter-level event classification model is obtained by pre-training, on the basis of the text chapter representation model of fig. 2, with a large number of positive and negative samples constructed in advance.
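The template-based element completion can be sketched as follows; the exact surface form of the template string, including where the time slot goes, is an assumption based on the example in the text:

```python
# Hypothetical rendering of the general template
# "(subject) at (place) (trigger word) (object)"; the time slot is appended.
TEMPLATE = "{subject} at {place} {trigger} {object} ({time})"

ROLES = ("subject", "object", "time", "place", "trigger")

def complete_elements(central_elements, related_elements, template=TEMPLATE):
    """Reconfigure the central sentence's elements into the general template,
    supplementing roles that are missing with elements taken from event
    sentences related to the central sentence."""
    filled = dict(central_elements)
    for role, value in related_elements.items():
        filled.setdefault(role, value)  # only fill roles still missing
    return template.format(**{r: filled.get(r, "") for r in ROLES})
```

A sample built this way (a positive sample when the supplements are correct, a negative one when a wrong filler is swapped in) would then be spliced with the original text for the chapter-level classifier to judge.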
It should be noted that the following embodiments can be combined, expanded, or substituted in any logically consistent way based on the above detailed description, including the disclosed technical principles and the explicitly or implicitly disclosed technical features, within the scope of protection defined by the claims of the present invention.
Example 1
A document-level event element extraction method comprises the following steps:
S1, sentence segmentation is carried out on the text of a chapter to obtain sentence fragments, and character coding, position coding and type coding are carried out on the sentence fragments to obtain coding vectors C, L and S respectively; the coding vectors C, L and S are superposed to obtain sentence-level initialization vectors, and the sentence-level initialization vectors are fed into a pre-training model to obtain sentence vectors V_i, the pre-training model comprising a BERT pre-training model; the sentence vectors V_i are concatenated to obtain an initial chapter vector V_initial; meanwhile, the sentence-level initialization vectors are fed into an attention mechanism network that assigns different weight values to different sentences, yielding a chapter vector V_relation containing the implicit relationships between sentences; V_initial and V_relation are superposed into a mixed characterization to obtain the final text chapter representation V_text;
S2, based on the text chapter representation V_text from step S1, the sentence vectors are classified in a sequence labeling manner to complete event sentence identification and element extraction, and the chapter vectors are classified to complete central sentence identification and element extraction while obtaining the event correlations;
S3, mathematical statistics are carried out on the correlations between the central sentences and the events, and central sentence screening is completed based on a competitive principle;
and S4, central event element completion is performed based on an element completion model to obtain the complete document-level event elements.
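The superposition of step S1 can be sketched as follows. This is a minimal numpy sketch under stated assumptions: the dimensions, the random stand-ins for the C/L/S code vectors, and the dot-product form of the attention are assumptions for illustration; the BERT encoder of the publication is stubbed by reusing the initialization vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_sent = 8, 3  # assumed hidden size and sentence count

# Stand-ins for the character, position and type code vectors C, L, S.
C, L, S = (rng.normal(size=(n_sent, d)) for _ in range(3))
init = C + L + S                 # superposed sentence-level initialization

V = init                         # stub for the BERT sentence vectors V_i
V_initial = V                    # stacked sentence vectors: initial chapter vector

# Attention over sentences: assign each sentence a different weight value.
scores = V @ V.T / np.sqrt(d)
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
V_relation = weights @ V         # chapter vector with inter-sentence relations

V_text = V_initial + V_relation  # superposed mixed characterization
```

Each row of `weights` sums to 1, so `V_relation` re-expresses every sentence as a weighted mixture of all sentences, which is how the implicit inter-sentence relationships enter the final chapter representation.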
Example 2
On the basis of embodiment 1, in step S1, the character encoding, position encoding, and type encoding of the sentence fragments specifically comprise: performing character encoding on the characters in each sentence fragment based on a character table, performing position encoding on the positional differences of the characters, and assigning a type code to the current sentence.
Example 3
On the basis of embodiment 1, in step S2, classifying the sentence vectors in a sequence labeling manner to complete event sentence identification and element extraction specifically comprises the steps of: event sentences are labeled with 0 and 1, where 0 denotes a non-event sentence and 1 denotes an event sentence; event elements are labeled with the B-I-O scheme, where B denotes the start position of an element, I the other positions of the element, and O a non-element position, and the subscripts sub, obj, tim, loc and tri denote the subject, object, time, place and trigger word respectively; event sentence identification applies a fully connected linear transformation to the [CLS] vector of the sentence vector followed by softmax binary classification to detect whether the sentence is an event sentence; element extraction applies a fully connected linear transformation to each character position vector in the sentence vector and detects its tag through softmax classification, finally obtaining whether the current sentence is an event sentence together with the specific event element information.
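The B-I-O decoding of embodiment 3 can be sketched as follows. The tag names follow the scheme above (B-/I- prefixes with role subscripts such as sub and tri); the function name and the span-collection logic are illustrative assumptions, and the classifier that produces the tags is not shown.

```python
def decode_bio(chars, tags):
    """Collect (role, text) element spans from per-character B-I-O tags."""
    spans, cur_role, cur_chars = [], None, []
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):            # element start bit: open a new span
            if cur_role:
                spans.append((cur_role, "".join(cur_chars)))
            cur_role, cur_chars = tag[2:], [ch]
        elif tag.startswith("I-") and cur_role == tag[2:]:
            cur_chars.append(ch)            # continuation bit of the same role
        else:                               # non-element bit: close any open span
            if cur_role:
                spans.append((cur_role, "".join(cur_chars)))
            cur_role, cur_chars = None, []
    if cur_role:
        spans.append((cur_role, "".join(cur_chars)))
    return spans
```

For example, characters "ABCDE" tagged B-sub, I-sub, O, B-tri, O decode into a subject span "AB" and a trigger span "D".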
Example 4
On the basis of embodiment 1, in step S2, classifying the chapter vectors to complete central sentence identification and element extraction while obtaining the event correlations specifically comprises the steps of: central sentences are labeled with 0 and 1; event relationships are labeled with 0, 1, 2 and 3, where 0 denotes no relationship, 1 a sequential relationship, 2 an associative relationship and 3 a causal relationship; central sentence identification applies a fully connected linear transformation to each [CLS] vector in the chapter vectors followed by softmax binary classification to detect whether the sentence is the central sentence;
and the relationships between events are extracted by applying a fully connected linear transformation to pairwise combinations of the [CLS] vectors in the chapter vectors and detecting the relationship class through softmax four-way classification.
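The pairwise relation head of embodiment 4 can be sketched as follows. This is a sketch under stated assumptions: the weights are random stand-ins rather than a trained model, and concatenating the two [CLS] vectors before the fully connected layer is one plausible reading of "full-connection linear change on [CLS] vectors in pairs".

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 8, 3
cls_vectors = rng.normal(size=(n, d))  # one [CLS] vector per sentence

# Fully connected layer mapping a concatenated pair to 4 relation classes
# (0 none, 1 sequential, 2 associative, 3 causal).
W = rng.normal(size=(2 * d, 4))

def classify_pair(u, v):
    logits = np.concatenate([u, v]) @ W        # linear transformation on the pair
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax four-way classification
    return int(probs.argmax())                 # predicted relation label

relations = {(i, j): classify_pair(cls_vectors[i], cls_vectors[j])
             for i in range(n) for j in range(i + 1, n)}
```

The resulting relation labels between sentence pairs are exactly the statistics consumed by the central sentence screening of step S3.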
Example 5
On the basis of embodiment 1, in step S3, the method includes the sub-steps of:
traversing and selecting each central sentence, counting the number of event sentences having hidden relations with the central sentences through a mathematical method, selecting the central sentences containing the most hidden relations as the unique central sentences of the text through a competitive principle, namely the more hidden relations are, the more prominent the central thought is, if all the central sentences do not have the hidden relations or the relations of the central sentences and the central sentences are consistent in number, defaulting to select the central sentences positioned at the front as the unique central sentences, if the number of the hidden relations is the same, defaulting to sort according to the priority level containing causal relations, association relations and sequential bearing relations, and selecting the central sentences containing the most hidden relations as the unique central sentences of the text.
Example 6
On the basis of embodiment 1, in step S4, the completing center event element completion based on the element completion model specifically includes the following substeps:
S41, defining a generic event element template, the template comprising five generic argument roles: subject, object, time, place and trigger word; constructing an event instance sample, reorganizing the elements of the finally confirmed central sentence according to the generic template sample, and filling missing positions with elements of the event sentences correlated with the central sentence, to form positive and negative samples;
s42, splicing the positive and negative samples with the original text respectively to form new samples, and then carrying out chapter-level event classification.
Example 7
On the basis of embodiment 6, in step S42, the chapter-level event classification specifically comprises the following sub-steps:
s421, performing text chapter representation to obtain chapter-level characteristics;
S422, probabilities for 0 and 1 are obtained through softmax binary classification, where 1 indicates that the sample is reasonable and the element completion is correct, and 0 indicates that the sample is unreasonable and the element completion is wrong; the chapter-level event classification model is obtained by pre-training on previously constructed positive and negative samples on the basis of the text chapter representation.
Example 8
On the basis of embodiment 1, the text chapter representation V_text is the vector representation of the entire text chapter, containing both the semantics of each sentence itself and the semantics of the sentence context.
Example 9
A computer device comprising a processor and a memory, the memory having stored therein a computer program which, when loaded by the processor, performs the method of any of embodiments 1 to 8.
Example 10
A readable storage medium, in which a computer program is stored, which computer program is loaded by a processor and executes a method according to any of embodiments 1-8.
The units described in the embodiments of the present invention may be implemented in software or in hardware, and the described units may also be disposed in a processor, where the names of the units do not in any way constitute a limitation on the units themselves.
According to an aspect of an embodiment of the present invention, there is provided a computer program product or a computer program comprising computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to execute the method provided in the various optional implementations described above.
As another aspect, an embodiment of the present invention further provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the method described in the above embodiments.
Parts not described in detail in the present invention are the same as, or can be implemented using, the prior art.
The above-described embodiments are only some embodiments of the present invention. It will be apparent to those skilled in the art that various modifications and variations can easily be made based on the application and principles disclosed herein, and the present invention is not limited to the methods described in the above embodiments; the above embodiments are therefore merely preferred, and not restrictive.
In addition to the foregoing examples, those skilled in the art, having the benefit of this disclosure, may derive other embodiments from its teachings or from modifications and variations using knowledge or skill of the related art; features of the various embodiments may be interchanged or substituted, and such modifications and variations do not depart from the spirit and scope of the present invention as set forth in the following claims.

Claims (10)

1. A document-level event element extraction method is characterized by comprising the following steps:
S1, sentence segmentation is carried out on the text of a chapter to obtain sentence fragments, and character coding, position coding and type coding are carried out on the sentence fragments to obtain coding vectors C, L and S respectively; the coding vectors C, L and S are superposed to obtain sentence-level initialization vectors, and the sentence-level initialization vectors are fed into a pre-training model to obtain sentence vectors V_i, the pre-training model comprising a BERT pre-training model; the sentence vectors V_i are concatenated to obtain an initial chapter vector V_initial; meanwhile, the sentence-level initialization vectors are fed into an attention mechanism network that assigns different weight values to different sentences, yielding a chapter vector V_relation containing the implicit relationships between sentences; V_initial and V_relation are superposed into a mixed characterization to obtain the final text chapter representation V_text;
S2, based on the text chapter representation V_text from step S1, the sentence vectors are classified in a sequence labeling manner to complete event sentence identification and element extraction, and the chapter vectors are classified to complete central sentence identification and element extraction while obtaining the event correlations;
S3, mathematical statistics are carried out on the correlations between the central sentences and the events, and central sentence screening is completed based on a competitive principle;
and S4, central event element completion is performed based on an element completion model to obtain the complete document-level event elements.
2. The document-level event element extraction method according to claim 1, wherein in step S1, the character encoding, position encoding and type encoding of the sentence fragments specifically comprise: performing character encoding on the characters in each sentence fragment based on a character table, performing position encoding on the positional differences of the characters, and assigning a type code to the current sentence.
3. The document-level event element extraction method according to claim 1, wherein in step S2, classifying the sentence vectors in a sequence labeling manner to complete event sentence identification and element extraction specifically comprises the steps of: event sentences are labeled with 0 and 1, where 0 denotes a non-event sentence and 1 denotes an event sentence; event elements are labeled with the B-I-O scheme, where B denotes the start position of an element, I the other positions of the element, and O a non-element position, and the subscripts sub, obj, tim, loc and tri denote the subject, object, time, place and trigger word respectively; event sentence identification applies a fully connected linear transformation to the [CLS] vector of the sentence vector followed by softmax binary classification to detect whether the sentence is an event sentence; element extraction applies a fully connected linear transformation to each character position vector in the sentence vector and detects its tag through softmax classification, finally obtaining whether the current sentence is an event sentence together with the specific event element information.
4. The document-level event element extraction method according to claim 1, wherein in step S2, classifying the chapter vectors to complete central sentence identification and element extraction while obtaining the event correlations specifically comprises the steps of: central sentences are labeled with 0 and 1; event relationships are labeled with 0, 1, 2 and 3, where 0 denotes no relationship, 1 a sequential relationship, 2 an associative relationship and 3 a causal relationship; central sentence identification applies a fully connected linear transformation to each [CLS] vector in the chapter vectors followed by softmax binary classification to detect whether the sentence is the central sentence;
and the relationships between events are extracted by applying a fully connected linear transformation to pairwise combinations of the [CLS] vectors in the chapter vectors and detecting the relationship class through softmax four-way classification.
5. The document-level event element extraction method according to claim 1, comprising, in step S3, the sub-steps of:
traversing and selecting each central sentence, counting the number of event sentences having hidden relations with the central sentences through a mathematical method, selecting the central sentences containing the most hidden relations as the unique central sentences of the text through a competitive principle, namely the more hidden relations are, the more prominent the central thought is, if all the central sentences do not have the hidden relations or the relations of the central sentences and the central sentences are consistent in number, defaulting to select the central sentences positioned at the front as the unique central sentences, if the number of the hidden relations is the same, defaulting to sort according to the priority level containing causal relations, association relations and sequential bearing relations, and selecting the central sentences containing the most hidden relations as the unique central sentences of the text.
6. The document-level event element extraction method according to claim 1, wherein in step S4, completing central event element completion based on the element completion model specifically comprises the sub-steps of:
S41, defining a generic event element template, the template comprising five generic argument roles: subject, object, time, place and trigger word; constructing an event instance sample, reorganizing the elements of the finally confirmed central sentence according to the generic template sample, and filling missing positions with elements of the event sentences correlated with the central sentence, to form positive and negative samples;
and S42, splicing the positive and negative samples with the original text respectively to form new samples, and then carrying out chapter-level event classification.
7. The document-level event element extraction method as claimed in claim 6, wherein in step S42, the chapter-level event classification specifically includes the sub-steps of:
s421, performing text chapter representation to obtain chapter-level characteristics;
S422, probabilities for 0 and 1 are obtained through softmax binary classification, wherein 1 indicates that the sample is reasonable and the element completion is correct, and 0 indicates that the sample is unreasonable and the element completion is wrong, and the chapter-level event classification model is obtained by pre-training on previously constructed positive and negative samples on the basis of the text chapter representation.
8. The document-level event element extraction method according to claim 1, wherein the text chapter representation V_text is the vector representation of the entire text chapter, containing both the semantics of each sentence itself and the semantics of the sentence context.
9. A computer device, characterized in that the computer device comprises a processor and a memory, the memory having stored therein a computer program which, when loaded by the processor, executes the method according to any one of claims 1 to 8.
10. A readable storage medium, in which a computer program is stored which is loaded by a processor and which performs the method according to any one of claims 1 to 8.
CN202211344142.0A 2022-10-31 2022-10-31 Document-level event element extraction method, apparatus and medium Pending CN115544212A (en)

Publications (1)

Publication Number Publication Date
CN115544212A true CN115544212A (en) 2022-12-30

Family

ID=84718082


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117435697A (en) * 2023-12-21 2024-01-23 中科雨辰科技有限公司 Data processing system for acquiring core event
CN117435697B (en) * 2023-12-21 2024-03-22 中科雨辰科技有限公司 Data processing system for acquiring core event


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination