CN114519344A

CN114519344A - Discourse element sub-graph prompt generation and guide-based discourse-level multi-event extraction method

Info

Publication number: CN114519344A
Application number: CN202210087670.6A
Authority: CN
Inventors: 庄越挺; 邵健; 吕梦瑶; 宗畅
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2022-01-25
Filing date: 2022-01-25
Publication date: 2022-05-20

Abstract

The invention discloses a discourse level multi-event extraction method based on argument sub-graph prompt generation and guidance. The invention uses the chapter-level long text encoder to obtain the complete text characteristics, and can simultaneously utilize chapter-level information and sentence-level information. The multi-event referring and positioning are realized through the event sketch generated by the extraction of the multi-argument relation, and the argument classification is realized by filling the event slots by using a pre-training model method based on a prompt paradigm, so that the accuracy of multi-event extraction is improved. The method does not need to use trigger words, and reduces the labeling burden on the data set.

Description

Discourse element sub-graph prompt generation and guide-based discourse-level multi-event extraction method

Technical Field

The invention relates to the technical field of natural language processing, in particular to a discourse level multi-event extraction method based on argument sub-graph prompt generation and guidance.

Background

With the rapid development of internet technology, mass data are rushed into the lives of people. In order to rapidly process large-scale data and mine information with potential value in the data, the demand of people for information extraction technology is increasing. Event extraction is an important task in the field of information extraction, and aims to detect the occurrence of an event from an unstructured natural language text, judge the type of the event, extract important elements participating in the event, and present the result in a structured manner. The event extraction has wide application value, and on one hand, the event extraction can assist in providing structured multivariate relation information and brings performance improvement for machine reading understanding and knowledge graph construction. On the other hand, event extraction can help people understand the operation process of events and assist analysis and decision in the field of practical application.

Currently, most of research on event extraction utilizes a deep learning method to model event extraction as a sequence labeling problem. Firstly, extracting a trigger word, and if the trigger word is contained, considering that an event occurs. And then extracts arguments from the text. And finally, judging whether the trigger word is related to the argument or not so as to determine whether the argument belongs to the event referred by the trigger word or not. However, such a method has the following disadvantages:

1. only the information at the sentence level is concerned, and the information at the chapter level is ignored. In an actual scene, events have the characteristic of argument dispersion, and texts describing the events in one document are generally distributed in a plurality of sentences, so that chapter-level information needs to be considered to obtain a complete extraction result.

2. The accuracy rate is not high for multiple event extraction. In a document in the actual field, a plurality of events are often distributed in a staggered manner, the existing method relies on a trigger word to refer to an event to extract the plurality of events, the trigger word of a real scene is often difficult to judge, and the phenomena that one event comprises a plurality of trigger words, one trigger word corresponds to a plurality of event types, no obvious trigger word exists and the like exist. Therefore, the method relying on the trigger word may cause redundancy or omission of the extraction result, resulting in poor multi-event extraction effect.

3. The over-reliance on the trigger word brings burden to data annotation. The existing method usually takes a trigger word as a medium, but the trigger word is only an intermediate result of event extraction, is not necessary, and has very high labeling difficulty, thus increasing the burden of manually constructing a data set.

In summary, the existing technical solutions have the disadvantages of ignoring chapter-level information, low accuracy of multi-event extraction, and excessive dependence on trigger words.

Disclosure of Invention

Aiming at the defects of the prior method, the invention provides a novel discourse-level multi-event extraction method based on argument sub-graph prompt generation and guidance. The method uses a chapter-level long text encoder, and can simultaneously utilize chapter-level information and sentence-level information. The multi-event referring and positioning are realized through the event sketch generated by the extraction of the multi-argument relation, and the argument classification is realized by filling the event slots by using a pre-training model method based on a prompt paradigm, so that the accuracy of multi-event extraction is improved. The method does not need to use trigger words, and reduces the labeling burden on the data set.

The technical scheme of the invention is as follows:

a discourse-level multi-event extraction method based on argument sub-graph prompt generation and guidance comprises the following steps:

s1: extracting candidate arguments from the input text;

s2: extracting an event sketch contained in the input text;

s3: constructing an argument subpicture prompt based on the event sketch, and filling an event slot under the guidance of the argument subpicture prompt to form an event record;

s4: and (4) setting iteration times, converting the event record obtained in the step (S3) into a new event sketch, and iterating and repeating the step (S3) to obtain a corrected final event record.

As a preferred embodiment of the present invention, the step S1 includes the following steps:

s11: in the training stage, the input event text is processed into an input form conforming to a Longformer model and is represented as a text sequence

Labeling the label of each element in a BIO labeling mode;

s12: for the text sequence D input in S11, after being encoded by an encoder based on a Longformer pre-training model, an intermediate vector HF is obtained:

s13: the intermediate vector HF obtained at S12 is passed through the full connection layer FC to obtain the final element representation vector ZF:

s14: computing per-element output tag/through softmax layer_iA posteriori probability P (l)_i|zf_i) Wherein W is_sAnd b_sAre trainable parameters:

P(l_i|zf_i)＝softmax(W_szf_i+b_s)

s15: outputting the label class with the highest probability for each element in the sequence

S16: analyzing a label sequence of the text sequence element obtained in the step S15 for a 'BIO' label to obtain all candidate argument instances contained in the document, merging the candidate argument instances through entity disambiguation and fusion, and associating the candidate argument instances to corresponding candidate argument entities, wherein each instance is called a mention of the candidate argument entities; and representing the obtained candidate argument entity set as candidate arguments

e_iAll mentioned examples of (A) are indicated as

As a preferred embodiment of the present invention, the step S2 includes the following steps:

s21: judging pairwise relations between the candidate arguments, and modeling relation judgment into a multi-label classification problem, wherein the relation category is equal to the event category plus an additional category 'threshold category';

s22: constructing a global candidate argument relation graph by using all the obtained candidate arguments and the relations therein; the global candidate argument relation graph is represented as an undirected graph G ═ (V, E), where V represents the set of vertices,

each point v_iIs a candidate argument that has been extracted; e represents a collection of edges, each (v)_i,v_j)∈E,(i,j≤N_eI ≠ j) represents v_iAnd v_jThere is a relationship between them, and the category of the relationship is R (v)_i,v_j)；

S23: extracting subgraphs from the global argument relation graph;

s24: constructing an event sketch according to the extracted subgraphs, and representing all the obtained event sketches as

Wherein each event sketch s_iIs t_iI.e. the type of edge in the subgraph; event sketch s_iThe argument post-selection set contained in (A) is expressed as

I.e. the set of all vertices in the candidate argument subgraph.

As a preferred embodiment of the present invention, the step S3 includes the following steps:

s31: for event sketch s_iAnd constructing a corresponding event prompt template, wherein the construction method comprises the following steps: "is in [ event type]In [ argument role 1 ]]Is [ ans _ slot _1 ]]And [ argument role 2]Is [ ans _ slot _2 ]]… ", where" [ event type]"is the type t of the event sketch_iAnd is used to indicate the argument character 1]"and" [ argument role 2]"equal is the predefined argument role for this event type," [ ans _ slot _1]"and" [ ans _ slot _2]"is an answer slot, consisting of one or more predefined identifiers;

s32: constructing event sketch modelA plate; for candidate arguments contained in the event sketch, it is converted into a text sequence using the following: "[ candidate argument 1][RD][ argument candidates 2][RD][ argument candidates 3]… ", where" [ candidate argument 1]"and" [ candidate argument 2]"equal is a candidate argument extracted from the event sketch

“[RD]"is a specific delimiter, consisting of one or more predefined identifiers;

s33: splicing the event prompt template in the S31 with the event sketch template obtained in the S32, adding a prefix "[ CLS ]", and spacing by using "[ SEP ]", so as to form an event subgraph prompt;

s34: filling the event slot.

As a preferred embodiment of the present invention, the step S4 includes the following steps:

s41: setting iterative correction times cnt;

s42: converting event arguments contained in the event record obtained in the step S3 into a new event sketch;

s43: and inputting the new event sketch obtained in the step S42 into the step S3, re-filling the event slots, and repeating iterative filling cnt for times, wherein the final result is the structured result of event extraction.

Compared with the prior art, the invention has the following beneficial effects:

(1) the invention uses the pretraining model based on Longformer which can process the ultra-long text as the text encoder, thereby directly utilizing chapter-level information, realizing the circulation of the global information and the local information of the document and improving the integrity of the event extraction result.

(2) The invention constructs a frame method and improves the chapter-level multi-event extraction effect by applying various optimization modes. Firstly, an adaptive threshold method and a subgraph extraction method based on pedigree filtering are applied to construct an event sketch to refer to multiple events and obtain a preliminary event record. And then converting the event extraction task into a slot filling task, and introducing a large amount of background knowledge contained in a pre-training model by applying a pre-training model method based on a prompt paradigm so as to improve the accuracy of event slot filling. And finally, the result is iteratively corrected for multiple times, so that the overall extraction effect is improved.

(3) The invention uses a method of firstly judging the binary relation between argument candidates and then extracting the multivariate subgraph on the global candidate argument relation graph to finally obtain the event sketch with argument multivariate relation as the reference of multiple events. The extraction of multiple events can be realized under the condition of not marking the trigger words, the requirement for marking the data set is reduced, and the labor and time burden during the construction of the data set is reduced.

Drawings

FIG. 1 is a flowchart of a discourse-level multi-event extraction method based on argument sub-graph prompt generation and guidance.

Detailed Description

In order to more clearly illustrate the technical method provided by the invention, taking the chfinnann public event data set as an example, the implementation steps of the discourse-level multi-event extraction method based on the generation and guidance of the argument subgraph prompt are specifically illustrated.

As shown in fig. 1, the method of the present invention comprises the following four steps:

s1: extracting candidate arguments from the input text;

s2: extracting an event sketch contained in the input text;

s4: and (4) setting iteration times, converting the event record obtained in the step (S3) into a new event sketch, and iteratively repeating the step (S3) to obtain a corrected final event record.

Preferably, the step of S1 is as follows:

s11, processing the input event text into an input form conforming to a Longformer model, wherein the input form can be represented as a text sequence

The label of each element is labeled by the labeling mode of 'BIO'.

S12, for the input text sequence, after the input text sequence is coded by a coder based on a Longformer pre-training model, obtaining an intermediate vector HF:

s13, obtaining a final element representation vector ZF by the obtained intermediate vector HF through a full connection layer FC:

s14, calculating output label l of each element through softmax layer_iA posteriori probability P (l)_i|zf_i) Wherein W is_sAnd b_sAre trainable parameters:

P(l_i|zf_i)＝softmax(W_szf_i+b_s)

s15, outputting the label category with the maximum probability for each element in the sequence

And S16, analyzing the label sequence of the obtained text sequence elements by using a BIO label to obtain all candidate argument instances contained in the document, combining the candidate argument instances through entity disambiguation and fusion, and associating the candidate argument instances to the corresponding candidate argument entities, wherein each instance is called a mention of the candidate argument entities. The final candidate argument entity set can be represented as a candidate argument

e_iAll mentioned examples of (A) can be represented as

Preferably, the step of S2 is as follows:

s21: judging the relation between every two candidate arguments, modeling the relation judgment as a multi-label classification problem, wherein the relation category is equal to the event category plus an additional category of a threshold category, and is specifically shown as S211-S217:

s211, for the input text sequence

Encoding by using a Longformer encoder to obtain an intermediate vector HS:

s212, candidate argument e_iIncluding the reference to

Is represented by the sequence position of

Which represents the starting position of the device,

representing the end position. Aggregating to form the mentioned representation vectors using an average pooling

S213. through calculating candidate argument e_iAverage pooling values of all mentioned representation vectors, calculating the representation direction of each candidate argumentMeasurement of

S214, sequentially selecting two different candidate arguments e_iAnd e_jConverted into a hidden vector by a linear layer and a non-linear layer tanh

And

wherein W_iAnd W_jIs a trainable parameter;

s215: computing probability P of relation category r through bilinear mapping bilinear_rWhere σ denotes the softmax function, W_rAnd b_rAre trainable parameters:

s216, training by using the following adaptive dynamic threshold loss function:

L_total＝L₁+L₂

wherein the relationship set formed by the positive classes is C_TThe relationship set composed of the negative classes is C_N(ii) a The sign of the threshold class is Th; at L₁Where r belongs to the positive class, r' belongs to the positive class and the threshold class, P_rProbability of representing r class, P_r′Representing the probability of the r' category; at L₂In, r' belongs to the negative class and the threshold class, P_r′Probability of representing r' class, P_THRepresenting a probability of a threshold class; l is₁The probability of the loss-optimized positive class is greater than the threshold class, L₂The penalty is such that the penalty for the threshold class is greater than the negative class; ultimate loss L_totalIs L₁And L₂The sum of (a);

s217: when predicting, judging whether the probability of each relation category is larger than the threshold class of the sample prediction to obtain whether the candidate argument has the relation of the category, wherein r represents a certain category, Rel represents candidate argument e_iAnd e_jThe relationship between:

if P_r(e_i,e_j)>P_Th(e_i,e_j),then Rel(e_i,e_j)＝r

and S22, constructing a global candidate argument relation graph by using all the obtained candidate arguments and the relations therein. This graph may be represented as an undirected graph, G ═ (V, E), where V represents a set of vertices,

each point v_iIs a candidate argument that the model has extracted; e represents a collection of edges, each (v)_i,v_j)∈E,(i,j≤N_eI ≠ j) represents v_iAnd v_jThere is a relationship between them, and the category of the relationship is R (v)_i,v_j)。

S23: extracting subgraphs from the global argument relationship graph, as shown in S231-S234:

s231. findTo all full k-values of k in G, { c }₁,c₂,…,c_n}；

S232, each complete subgraph of k-cliques is defined as a new vertex, and when the number of the same original vertices contained between every two new vertices is larger than or equal to k-1, the two new vertices are endowed with an edge, so that a new graph G can be formed_newThen, the analysis will continue on the new graph;

s233, finding G_newAll complete subgraphs in (1);

and S234, all original vertexes contained in each complete subgraph form a subgraph, namely the candidate argument subgraph to be extracted finally.

And S24, constructing an event sketch according to the extracted subgraphs. All resulting event sketches can be represented as

Wherein each event sketch s_iIs t_iI.e. the type of edge in the subgraph; event sketch s_iThe argument post-selection set contained in (A) can be expressed as

I.e. the set of all vertices in the candidate argument subgraph.

Preferably, the step of S3 is as follows:

s31, for event sketch s_iAnd constructing a corresponding event prompt template, wherein the construction method comprises the following steps: "is in [ event type]In, [ argument role 1]Is [ ans _ slot _1 ]]And [ argument role 2]Is [ ans _ slot _2 ]]… ", where" [ event type]"is the type t of the event sketch_iAnd is used to indicate the argument role 1]"and" [ argument role 2]"equal is the predefined argument role for this event type," [ ans _ slot _1]"and" [ ans _ slot _2]"is an answer slot, consisting of one or more predefined identifiers, in this embodiment, each answer slot uses, for example," [ unused1][unused2]"patterns of these two identifiers, sequence number is incremented.

And S32, constructing an event sketch template. For eventsCandidate arguments contained in the sketch, which are translated into a text sequence using the following: "[ candidate argument 1][RD][ argument candidates 2][RD][ argument candidates 3]… ", where" [ candidate argument 1]"and" [ candidate argument 2]"equal is a candidate argument extracted from the event sketch

“[RD]"is a specific separator, consisting of one or more predefined identifiers, which in this embodiment is used" [ unused80 ]]”。

And S33, splicing the event prompt template in the S31 and the event sketch template obtained in the S32, adding a prefix of 'CLS', and spacing by using 'SEP', so that an argument subpicture prompt can be formed.

S34: filling the event slot, wherein the specific steps are shown as S341-S349:

s341, splicing the argument subpicture prompt obtained in S33 with the text, and using "[ SEP ]]"As a spacer, the input text is represented as

S342. after longformer encoding, an intermediate vector HT is generated:

s343, extracting the obtained candidate argument

The positions of the event sketch and the original text part in the new text sequence are represented as

An interval. Obtaining a representation vector of candidate argument mentions by average pooling of intermediate vectors of all elements included in a candidate argument mention

S344, the expression vector of the candidate argument can be obtained by carrying out average pooling on all mentioned expression vectors of the candidate argument

S345. initialize a vector ht_NULLTo represent the case where the answer slot is not filled with any arguments. ht_NULLDimension of and

similarly, the vector parameters are continuously updated through training, so that the expression vector with the meaning of "NULL" can be learned. Will ht_NULLThe answer candidate can be obtained after splicing argument candidate expression vector, and can be expressed as

Wherein N is_eIs the number of candidate arguments, when q is equal to N_eAt the time of +1, the reaction solution,

if not, then,

s346, by performing average pooling on the intermediate vectors of all special placeholders forming a certain answer slot, a representation vector of the answer slot position can be obtained

Wherein

Representing event-based sketches s_iThe j-th answer slot of the sample, the calculation formula is as follows:

s347. for each answer slot, in all answer candidate vectors

The most suitable one is selected for filling, and the probability of each candidate answer is calculated according to the following formula, wherein W_k、W_pAnd W_uIs a trainable parameter, σ represents the softmax function:

s348, performing loss function calculation through cross entropy,

is a true tag:

s349, at the time of prediction, for a certain point

Selecting one of all answer candidates with the highest probability:

the theta answer candidate obtained by the formula is the answer that should be filled in the event slot. When the answer is a candidate argument, the argument role slot can be directly filled, and if the answer is 'NULL', the argument role slot is indicated to be kept vacant.

Preferably, the step of S4 is as follows:

s41, setting iterative correction times cnt;

s42, converting the event arguments contained in the obtained event records into a new event sketch;

and S43, filling the event slot again for the obtained new event sketch, and repeatedly iterating and filling the cnt for times, wherein the final result is the structured result of event extraction.

The chfinnann dataset contains five classes of events, respectively: stock Freeze (Equity Freeze, EF), stock Repurchase (ER), stock cutback (EU), stock overtightness (EO), and stock Pledge (EP). After the above steps of the present invention, the specific experimental results of the present method (DMEE-ASP) on this data set are shown in tables 1 and 2 below:

TABLE 1 event type Classification results

	DMEE-ASP
		Fraction frozen-F1 value	100.0
Stock repurchase-F1 value	100.0
		Stock reduction-F1 value	99.3
Stock increase-F1 value	99.5
		The value of the fraction pledge-F1	99.9

Table 2 event argument extraction experimental results

As can be seen from tables 1 and 2, the invention can complete the event type classification and the event argument extraction of the chapter-level multi-event extraction task without trigger words, and the experimental result is good, wherein the F1 value in the event type classification reaches more than 99, and the total F1 value in the event argument extraction reaches the best level at present. The method effectively improves the accuracy of the event extraction result, and has certain universality and superiority.

The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the appended claims.

Claims

1. A discourse-level multi-event extraction method based on argument sub-graph prompt generation and guidance is characterized by comprising the following steps:

s1: extracting candidate arguments from the input text;

s2: extracting an event sketch contained in the input text;

2. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 1, wherein said step S1 comprises the following steps:

Labeling the label of each element in a BIO labeling mode;

s14: passing through softmax layerTo calculate per-element output label/_iA posteriori probability P (l)_i|zf_i) Wherein W is_sAnd b_sAre trainable parameters:

P(l_i|zf_i)＝softmax(W_szf_i+b_s)

e_iAll mentioned examples of (A) are represented by

3. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 1, wherein said step S2 comprises the following steps:

s22: constructing a global candidate argument relation graph by using all the obtained candidate arguments and the relations therein; all-purposeThe graph of the office candidate argument relation is represented as an undirected graph G ═ V, E, where V represents the set of vertices,

(|V|＝N_e) Each point v_iIs a candidate argument that has been extracted; e represents a collection of edges, each (v)_i，v_j)∈E，(i，j≤N_eI ≠ j) represents v_iAnd v_jThere is a relationship between them, and the category of the relationship is R (v)_i，v_j)；

S23: extracting subgraphs from the global argument relation graph;

I.e. the set of all vertices in the candidate argument subgraph.

4. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 3, wherein said step S21 comprises the following steps:

s211: for an input text sequence

Encoding by using a Longformer encoder to obtain an intermediate vector HS:

s212: candidate argument e_iIncluding the reference to

Is represented by the sequence position of

Which represents the starting position of the device,

represents an end position; aggregating to form the mentioned representation vectors using means of average pooling (Mean)

S213: by computing candidate arguments e_iAverage pooling values of all mentioned representative vectors, calculating a representative vector for each candidate argument

S214: two different candidate arguments e are selected in turn_iAnd e_jConverted into a hidden vector by a linear layer and a non-linear layer tanh

And

wherein W_iAnd W_jIs a trainable parameter;

s215: computing probability P of relation category r by bilinear mapping bilinear_rWhere σ denotes the softmax function, W_rAnd b_rAre trainable parameters:

s216: training is performed using an adaptive dynamic threshold loss function as follows:

L_total＝L₁+L₂

wherein the relationship set formed by the positive classes is C_TThe relationship set composed of the negative classes is C_N(ii) a The sign of the threshold class is Th; at L₁Where r belongs to the positive class, r' belongs to the positive class and the threshold class, P_rProbability of representing r class, P_r′Probability of representing r' category; at L₂In, r' belongs to the negative class and the threshold class, P_r′Probability of representing r' class, P_THRepresenting a probability of a threshold class; l is₁Probability of loss optimization positive class is greater thanClass of thresholds, L₂The loss is such that the loss of the threshold class is greater than the negative class; ultimate loss L_totalIs L₁And L₂The sum of (a);

if P_r(e_i，e_j)＞P_Th(e_i，e_j)，then Rel(e_i，e_j)＝r。

5. the method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 3, wherein said step S23 comprises the following steps:

s231: finding all k-size complete subgraphs { c }in G₁，c₂，...，c_n}；

S232: defining each complete subgraph of k-cliques as a new vertex, and giving one edge to each two new vertices when the number of the same original vertices contained between the two new vertices is greater than or equal to k-1, thereby forming a new graph G_newThen, the analysis will continue on the new graph;

s233: find G_newAll complete subgraphs in (1);

s234: all the original vertices contained in each complete subgraph constitute a subgraph, i.e. the candidate argument subgraph to be extracted finally.

6. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 1, wherein said step S3 comprises the following steps:

s31: for event sketch s_iAnd constructing a corresponding event prompt template, wherein the construction method comprises the following steps: "is in [ event type]In, [ argument role 1]Is [ ans _ slot _1 ]]Angle of argumentColor 2]Is [ ans _ slot _2 ]]… ", where" [ event type]"is the type t of the event sketch_iAnd is used to indicate the argument character 1]"and" [ argument role 2]"equal is the predefined argument role for this event type," [ ans _ slot _1]"and" [ ans _ slot _2]"is an answer slot, consisting of one or more predefined identifiers;

s32: constructing an event sketch template; for candidate arguments contained in the event sketch, it is converted into a text sequence using the following: "[ candidate argument 1][RD][ argument candidates 2][RD][ candidate argument 3]… ", where" [ candidate argument 1]"and" [ candidate argument 2]"equal is a candidate argument extracted from the event sketch

s34: filling the event slot.

7. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 6, wherein said step S34 comprises the following steps:

s341: splicing the argument subpicture prompt obtained in S33 with the text, and using "[ SEP ]]"As a spacer, the input text is represented as

S342: after longformer coding, an intermediate vector HT is generated:

s343: candidate argument mentioned in S1

An interval; obtaining a representation vector of candidate argument mentions by average pooling of intermediate vectors of all elements included in a candidate argument mention

S344: obtaining the expression vector of the candidate argument by averagely pooling all mentioned expression vectors of the candidate argument

S345: initializing a vector ht_NULLTo represent the case where the answer slot is not filled with any argument; ht_NULLDimension of and

similarly, the vector parameters are continuously updated through training, so that a representation vector with the meaning of NULL is learned; will ht_NULLSplicing argument candidate expression vectors to obtain answer candidates represented as

Wherein N is_eIs the number of candidate arguments, when q is equal to N_e+1When the utility model is used, the water is discharged,

if not, then,

s346: obtaining the expression vector of the answer slot position by performing average pooling on the intermediate vectors of all special placeholders forming a certain answer slot

Wherein

Representing event-based sketches s_iThe j-th answer slot of the sample is calculated as follows:

s347: for each answer slot, all answer candidate vectors

s348: the computation of the loss function is performed by cross entropy,

is a true tag:

s349: at the time of prediction, for a certain

Selecting one of all answer candidates with the highest probability:

the theta answer candidate obtained by the formula is the answer which should be filled in the event slot; when the answer is a candidate argument, the argument role slot is directly filled, and if the answer is 'NULL', the argument role slot is indicated to be kept vacant.

8. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 1, wherein said step S4 comprises the following steps:

s41: setting iterative correction times cnt;