CN114519344A - Discourse element sub-graph prompt generation and guide-based discourse-level multi-event extraction method - Google Patents


Info

Publication number
CN114519344A
CN114519344A (application CN202210087670.6A)
Authority
CN
China
Prior art keywords
event
argument
candidate
sketch
graph
Prior art date
Legal status
Pending
Application number
CN202210087670.6A
Other languages
Chinese (zh)
Inventor
庄越挺 (Zhuang Yueting)
邵健 (Shao Jian)
吕梦瑶 (Lyu Mengyao)
宗畅 (Zong Chang)
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202210087670.6A priority Critical patent/CN114519344A/en
Publication of CN114519344A publication Critical patent/CN114519344A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/205 Parsing
    • G06F 40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A 10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping


Abstract

The invention discloses a discourse-level multi-event extraction method based on argument-subgraph prompt generation and guidance. A discourse-level long-text encoder obtains complete text features, so that discourse-level and sentence-level information can be used simultaneously. Multiple events are referenced and located through event sketches generated by multi-argument relation extraction, and arguments are classified by filling event slots with a pre-trained model under a prompt paradigm, improving the accuracy of multi-event extraction. The method requires no trigger words, reducing the annotation burden on datasets.

Description

Discourse element sub-graph prompt generation and guide-based discourse-level multi-event extraction method
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a discourse-level multi-event extraction method based on argument-subgraph prompt generation and guidance.
Background
With the rapid development of Internet technology, massive amounts of data have flooded into people's lives. To process large-scale data quickly and mine the potentially valuable information it contains, the demand for information extraction technology keeps growing. Event extraction is an important task in the field of information extraction: it aims to detect the occurrence of an event in unstructured natural language text, judge the event type, extract the important elements participating in the event, and present the result in a structured form. Event extraction has wide application value. On the one hand, it can supply structured multivariate relational information and bring performance improvements to machine reading comprehension and knowledge graph construction. On the other hand, it can help people understand how events unfold and assist analysis and decision-making in practical application fields.
Currently, most research on event extraction uses deep learning to model event extraction as a sequence labeling problem. First, trigger words are extracted; if a trigger word is present, an event is considered to have occurred. Arguments are then extracted from the text. Finally, whether each trigger word is related to each argument is judged, to determine whether the argument belongs to the event the trigger word refers to. However, such methods have the following disadvantages:
1. Only sentence-level information is considered, while discourse-level information is ignored. In real scenarios events exhibit argument dispersion: the text describing one event in a document is usually spread over several sentences, so discourse-level information must be considered to obtain a complete extraction result.
2. Accuracy is low for multi-event extraction. In real-world documents, multiple events are often interleaved. Existing methods rely on trigger words to refer to events when extracting multiple events, but real-scenario trigger words are often hard to determine: one event may contain several trigger words, one trigger word may correspond to several event types, and some events have no obvious trigger word at all. Methods that rely on trigger words can therefore produce redundant or missing extractions, leading to poor multi-event extraction performance.
3. Over-reliance on trigger words burdens data annotation. Existing methods usually use trigger words as an intermediary, but a trigger word is only an intermediate result of event extraction, not a necessity, and it is very difficult to annotate, which increases the burden of manually constructing datasets.
In summary, existing technical solutions suffer from ignoring discourse-level information, low multi-event extraction accuracy, and excessive dependence on trigger words.
Disclosure of Invention
To address the shortcomings of prior methods, the invention provides a novel discourse-level multi-event extraction method based on argument-subgraph prompt generation and guidance. The method uses a discourse-level long-text encoder, so that discourse-level and sentence-level information can be used simultaneously. Multiple events are referenced and located through event sketches generated by multi-argument relation extraction, and arguments are classified by filling event slots with a pre-trained model under a prompt paradigm, improving the accuracy of multi-event extraction. The method requires no trigger words, reducing the annotation burden on datasets.
The technical scheme of the invention is as follows:
A discourse-level multi-event extraction method based on argument-subgraph prompt generation and guidance comprises the following steps:
S1: extracting candidate arguments from the input text;
S2: extracting the event sketches contained in the input text;
S3: constructing an argument subgraph prompt based on each event sketch, and filling the event slots under the guidance of the argument subgraph prompt to form an event record;
S4: setting the number of iterations, converting the event record obtained in S3 into a new event sketch, and repeating S3 iteratively to obtain the corrected final event record.
As a preferred embodiment of the present invention, step S1 comprises the following steps:
S11: in the training stage, the input event text is processed into an input form conforming to the Longformer model and represented as a text sequence D = {d_1, d_2, ..., d_{N_d}}; the label of each element is annotated using the "BIO" scheme;
S12: the text sequence D input in S11 is encoded by an encoder based on the Longformer pre-trained model to obtain an intermediate vector HF:
HF = Longformer(D)
S13: the intermediate vector HF obtained in S12 is passed through a fully connected layer FC to obtain the final element representation vector ZF:
ZF = FC(HF)
S14: the posterior probability P(l_i | zf_i) of each element's output label l_i is computed through a softmax layer, where W_s and b_s are trainable parameters:
P(l_i | zf_i) = softmax(W_s zf_i + b_s)
S15: for each element of the sequence, the label class with the highest probability is output:
l_i = argmax_l P(l | zf_i)
S16: the "BIO" label sequence of the text elements obtained in S15 is parsed to obtain all candidate argument instances contained in the document; the candidate argument instances are merged through entity disambiguation and fusion and associated with the corresponding candidate argument entities, each instance being called a mention of its candidate argument entity; the resulting candidate argument entity set is represented as the candidate arguments E = {e_1, e_2, ..., e_{N_e}}, and all mention instances of e_i are denoted M_i = {m_{i,1}, m_{i,2}, ...}.
As a preferred embodiment of the present invention, step S2 comprises the following steps:
S21: judging the pairwise relations between candidate arguments, modeling relation judgment as a multi-label classification problem in which the relation categories equal the event categories plus an extra "threshold" category;
S22: constructing a global candidate-argument relation graph from all obtained candidate arguments and the relations among them; the graph is represented as an undirected graph G = (V, E), where V is the vertex set {v_1, v_2, ..., v_{N_e}}, each vertex v_i being an extracted candidate argument; E is the edge set, and each (v_i, v_j) ∈ E (i, j ≤ N_e, i ≠ j) indicates that a relation exists between v_i and v_j, the relation category being R(v_i, v_j);
S23: extracting subgraphs from the global argument relation graph;
S24: constructing event sketches from the extracted subgraphs; all obtained event sketches are represented as S = {s_1, s_2, ..., s_{N_s}}, where the type of each event sketch s_i is t_i, i.e. the type of the edges in its subgraph; the candidate argument set contained in event sketch s_i is represented as EC_i, i.e. the set of all vertices in the candidate argument subgraph.
As a preferred embodiment of the present invention, step S3 comprises the following steps:
S31: for event sketch s_i, a corresponding event prompt template is constructed as follows: "in [event type], [argument role 1] is [ans_slot_1], [argument role 2] is [ans_slot_2] ...", where "[event type]" is the type t_i of the event sketch, "[argument role 1]", "[argument role 2]", etc. are the predefined argument roles of this event type, and "[ans_slot_1]", "[ans_slot_2]", etc. are answer slots, each consisting of one or more predefined identifiers;
S32: an event sketch template is constructed; the candidate arguments contained in the event sketch are converted into a text sequence of the form "[candidate argument 1][RD][candidate argument 2][RD][candidate argument 3] ...", where "[candidate argument 1]", "[candidate argument 2]", etc. are the candidate arguments extracted into the event sketch, and "[RD]" is a dedicated delimiter consisting of one or more predefined identifiers;
S33: the event prompt template of S31 and the event sketch template of S32 are concatenated, prefixed with "[CLS]" and separated by "[SEP]", forming the argument subgraph prompt;
S34: the event slots are filled.
As a preferred embodiment of the present invention, step S4 comprises the following steps:
S41: setting the number of iterative corrections cnt;
S42: converting the event arguments contained in the event record obtained in S3 into a new event sketch;
S43: feeding the new event sketch obtained in S42 back into S3, re-filling the event slots, and repeating this iterative filling cnt times; the final result is the structured result of event extraction.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention uses a Longformer-based pre-trained model capable of processing very long text as the text encoder, directly exploiting discourse-level information, enabling the flow of global and local document information, and improving the completeness of the event extraction result.
(2) The invention builds a framework that improves discourse-level multi-event extraction through several optimizations. First, an adaptive thresholding method and a clique-filtering-based subgraph extraction method construct event sketches to refer to multiple events and obtain preliminary event records. The event extraction task is then converted into a slot-filling task, and a prompt-paradigm pre-trained model introduces the large amount of background knowledge contained in the pre-trained model, improving the accuracy of event slot filling. Finally, the result is corrected over multiple iterations, improving the overall extraction performance.
(3) The invention first judges binary relations between candidate arguments and then extracts multivariate subgraphs on the global candidate-argument relation graph, finally obtaining event sketches with multivariate argument relations as references for multiple events. Multi-event extraction is thus achieved without annotating trigger words, lowering the annotation requirements for datasets and reducing the labor and time cost of dataset construction.
Drawings
FIG. 1 is a flowchart of a discourse-level multi-event extraction method based on argument sub-graph prompt generation and guidance.
Detailed Description
To illustrate the technical method provided by the invention more clearly, the implementation steps of the discourse-level multi-event extraction method based on argument-subgraph prompt generation and guidance are described in detail, taking the public ChFinAnn event dataset as an example.
As shown in FIG. 1, the method of the invention comprises the following four steps:
S1: extracting candidate arguments from the input text;
S2: extracting the event sketches contained in the input text;
S3: constructing an argument subgraph prompt based on each event sketch, and filling the event slots under the guidance of the argument subgraph prompt to form an event record;
S4: setting the number of iterations, converting the event record obtained in S3 into a new event sketch, and repeating S3 iteratively to obtain the corrected final event record.
Preferably, step S1 proceeds as follows:
S11. The input event text is processed into an input form conforming to the Longformer model, which can be represented as a text sequence D = {d_1, d_2, ..., d_{N_d}}. The label of each element is annotated using the "BIO" scheme.
S12. The input text sequence is encoded by an encoder based on the Longformer pre-trained model to obtain an intermediate vector HF:
HF = Longformer(D)
S13. The intermediate vector HF is passed through a fully connected layer FC to obtain the final element representation vector ZF:
ZF = FC(HF)
S14. The posterior probability P(l_i | zf_i) of each element's output label l_i is computed through a softmax layer, where W_s and b_s are trainable parameters:
P(l_i | zf_i) = softmax(W_s zf_i + b_s)
S15. For each element of the sequence, the label class with the highest probability is output:
l_i = argmax_l P(l | zf_i)
S16. The "BIO" label sequence of the text elements is parsed to obtain all candidate argument instances contained in the document; the candidate argument instances are merged through entity disambiguation and fusion and associated with the corresponding candidate argument entities, each instance being called a mention of its candidate argument entity. The final candidate argument entity set can be represented as the candidate arguments E = {e_1, e_2, ..., e_{N_e}}, and all mention instances of e_i can be represented as M_i = {m_{i,1}, m_{i,2}, ...}.
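The post-processing in S16 can be sketched in Python as follows. This is a hypothetical illustration: the "BIO" label sequence from S15 is decoded into candidate argument spans, and identical surface forms are fused into one candidate argument entity whose occurrences are its mentions. The tag names and the string-equality fusion rule are simplifying assumptions, not the patent's exact disambiguation procedure.

```python
# Hypothetical sketch of S16: decode "BIO" labels into spans, then fuse
# identical surface forms into candidate argument entities with mentions.

def decode_bio(tokens, labels):
    """Collect (text, start, end) spans from B-*/I-*/O labels."""
    spans, start = [], None
    for i, lab in enumerate(labels + ["O"]):  # sentinel flushes the last span
        if (lab.startswith("B-") or lab == "O") and start is not None:
            spans.append(("".join(tokens[start:i]), start, i))
            start = None
        if lab.startswith("B-"):
            start = i
    return spans

def merge_mentions(spans):
    """Entity fusion: spans with the same surface text become one entity."""
    entities = {}
    for text, s, e in spans:
        entities.setdefault(text, []).append((s, e))
    return entities
```

Each key of the returned dictionary is a candidate argument entity e_i, and its value is the mention list M_i.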
Preferably, step S2 proceeds as follows:
S21: The pairwise relations between candidate arguments are judged, and relation judgment is modeled as a multi-label classification problem in which the relation categories equal the event categories plus an extra "threshold" category, as detailed in S211-S217:
S211. The input text sequence D is encoded with the Longformer encoder to obtain an intermediate vector HS:
HS = Longformer(D)
S212. A mention m_{i,k} of candidate argument e_i is represented by its sequence interval (start_{i,k}, end_{i,k}), where start_{i,k} is the start position and end_{i,k} the end position. The mention representation vector hm_{i,k} is formed by average pooling:
hm_{i,k} = mean(HS[start_{i,k} : end_{i,k}])
S213. The representation vector z_i of each candidate argument e_i is computed as the average pooling of all of its mention representation vectors:
z_i = mean_k(hm_{i,k})
S214. Two different candidate arguments e_i and e_j are selected in turn and converted into hidden vectors h_i and h_j by a linear layer followed by the nonlinearity tanh, where W_i and W_j are trainable parameters:
h_i = tanh(W_i z_i)
h_j = tanh(W_j z_j)
S215: The probability P_r of relation category r is computed through a bilinear mapping, where σ denotes the softmax function and W_r and b_r are trainable parameters:
P_r(e_i, e_j) = σ(h_i^T W_r h_j + b_r)
S216. Training uses the following adaptive dynamic-threshold loss, where the set of positive relation classes is C_T, the set of negative classes is C_N, and the threshold class is denoted Th:
L_1 = -Σ_{r ∈ C_T} log( exp(P_r) / Σ_{r' ∈ C_T ∪ {Th}} exp(P_{r'}) )
L_2 = -log( exp(P_Th) / Σ_{r' ∈ C_N ∪ {Th}} exp(P_{r'}) )
L_total = L_1 + L_2
In L_1, r ranges over the positive classes and r' over the positive classes plus the threshold class; P_r is the score of class r and P_{r'} the score of class r'. In L_2, r' ranges over the negative classes plus the threshold class, and P_Th is the score of the threshold class. The L_1 loss pushes the probabilities of the positive classes above the threshold class, while the L_2 loss pushes the threshold class above the negative classes; the final loss L_total is the sum of L_1 and L_2.
S217: At prediction time, each relation category whose probability exceeds the sample's predicted threshold class is taken as a relation held by the candidate arguments, where r denotes a category and Rel denotes the relation between candidate arguments e_i and e_j:
if P_r(e_i, e_j) > P_Th(e_i, e_j), then Rel(e_i, e_j) = r
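The adaptive-threshold prediction rule of S217 can be sketched as follows: a relation class r is assigned to a candidate-argument pair only when its probability exceeds that of the learned threshold class "Th". The class names and logit values below are made-up examples, not the patent's scores.

```python
# Minimal sketch of S217: emit every relation class whose softmax probability
# exceeds the probability of the threshold class "Th".

import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def predict_relations(logits, classes, th="Th"):
    probs = dict(zip(classes, softmax(logits)))
    return [r for r in classes if r != th and probs[r] > probs[th]]
```

Because the threshold is a learned class scored per pair, each sample effectively gets its own decision threshold instead of a global cutoff.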
S22. A global candidate-argument relation graph is constructed from all obtained candidate arguments and the relations among them. The graph can be represented as an undirected graph G = (V, E), where V is the vertex set {v_1, v_2, ..., v_{N_e}}, each vertex v_i being a candidate argument extracted by the model; E is the edge set, and each (v_i, v_j) ∈ E (i, j ≤ N_e, i ≠ j) indicates that a relation exists between v_i and v_j, the relation category being R(v_i, v_j).
S23: Subgraphs are extracted from the global argument relation graph, as shown in S231-S234:
S231. All complete subgraphs (k-cliques) of size k in G are found: {c_1, c_2, ..., c_n};
S232. Each k-clique complete subgraph is taken as a new vertex, and an edge is placed between two new vertices whenever they share at least k-1 original vertices, forming a new graph G_new, on which the analysis continues;
S233. All complete subgraphs in G_new are found;
S234. The original vertices contained in each such complete subgraph together form one subgraph, namely a finally extracted candidate argument subgraph.
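The steps S231-S234 can be sketched with the standard library as follows. This is an illustrative reading of the procedure as clique percolation: k-cliques sharing at least k-1 vertices are grouped, and each group's union of original vertices forms one candidate argument subgraph. The toy edge lists in the usage are assumptions, not the patent's graphs.

```python
# Illustrative sketch of S231-S234 (clique-percolation reading): find all
# k-cliques, group cliques sharing >= k-1 vertices, return vertex unions.

from itertools import combinations

def k_cliques(edges, k):
    """All complete subgraphs (cliques) of exactly k vertices."""
    nodes = sorted({n for e in edges for n in e})
    eset = {frozenset(e) for e in edges}
    return [set(c) for c in combinations(nodes, k)
            if all(frozenset(p) in eset for p in combinations(c, 2))]

def percolate(edges, k):
    """Group k-cliques that share >= k-1 vertices; return their vertex unions."""
    groups = []  # each group is a list of cliques
    for c in k_cliques(edges, k):
        merged = [g for g in groups if any(len(c & q) >= k - 1 for q in g)]
        groups = [g for g in groups if g not in merged]
        groups.append([c] + [q for g in merged for q in g])
    return [set().union(*g) for g in groups]
```

For example, with k = 3, two triangles sharing an edge merge into one subgraph, while triangles sharing only a single vertex stay separate.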
S24. Event sketches are constructed from the extracted subgraphs. All obtained event sketches can be represented as S = {s_1, s_2, ..., s_{N_s}}, where the type of each event sketch s_i is t_i, i.e. the type of the edges in its subgraph; the candidate argument set contained in event sketch s_i can be represented as EC_i, i.e. the set of all vertices in the candidate argument subgraph.
Preferably, step S3 proceeds as follows:
S31. For event sketch s_i, a corresponding event prompt template is constructed as follows: "in [event type], [argument role 1] is [ans_slot_1], [argument role 2] is [ans_slot_2] ...", where "[event type]" is the type t_i of the event sketch, "[argument role 1]", "[argument role 2]", etc. are the predefined argument roles of this event type, and "[ans_slot_1]", "[ans_slot_2]", etc. are answer slots, each consisting of one or more predefined identifiers. In this embodiment, each answer slot uses a pattern of two identifiers such as "[unused1][unused2]", with the sequence numbers incremented.
S32. An event sketch template is constructed. The candidate arguments contained in the event sketch are converted into a text sequence of the form "[candidate argument 1][RD][candidate argument 2][RD][candidate argument 3] ...", where "[candidate argument 1]", "[candidate argument 2]", etc. are the candidate arguments extracted into the event sketch, and "[RD]" is a dedicated delimiter consisting of one or more predefined identifiers; this embodiment uses "[unused80]".
S33. The event prompt template of S31 and the event sketch template of S32 are concatenated, prefixed with "[CLS]" and separated with "[SEP]", forming the argument subgraph prompt.
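The assembly in S31-S33 can be sketched as plain string construction. The "[unusedN]" answer-slot identifiers and the "[RD]" delimiter follow the embodiment; the event type and role names passed in below are illustrative assumptions, not the patent's full role inventory.

```python
# Sketch of S31-S33: build the event prompt template, the event sketch
# template, and concatenate them into the argument subgraph prompt.

def build_prompt(event_type, roles, candidates, sep="[RD]"):
    # S31: event prompt template with two-identifier answer slots
    slots = [f"[unused{2*i+1}][unused{2*i+2}]" for i in range(len(roles))]
    event_part = f"in [{event_type}], " + ", ".join(
        f"[{r}] is {s}" for r, s in zip(roles, slots))
    # S32: event sketch template listing candidate arguments
    sketch_part = sep.join(f"[{c}]" for c in candidates)
    # S33: concatenate with "[CLS]" prefix and "[SEP]" separators
    return f"[CLS]{event_part}[SEP]{sketch_part}[SEP]"
```

The resulting string is what gets concatenated with the document text in S341 before Longformer encoding.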
S34: The event slots are filled, as shown in S341-S349:
S341. The argument subgraph prompt obtained in S33 is concatenated with the text, using "[SEP]" as the separator; the resulting input text is represented as T.
S342. After Longformer encoding, an intermediate vector HT is generated:
HT = Longformer(T)
S343. The positions of the extracted candidate arguments E = {e_1, ..., e_{N_e}} within the original-text portion of the new sequence are represented as intervals. The representation vector ht_{i,k} of a candidate argument mention is obtained by average pooling the intermediate vectors of all elements the mention contains:
ht_{i,k} = mean(HT[start_{i,k} : end_{i,k}])
S344. The representation vector ha_i of a candidate argument can be obtained by average pooling all of its mention representation vectors:
ha_i = mean_k(ht_{i,k})
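The two-level average pooling in S343-S344 can be illustrated with a toy example: a mention vector is the mean of its token vectors, and a candidate argument's representation is the mean of its mention vectors. Plain Python lists stand in for the Longformer intermediate vectors; all numbers are invented.

```python
# Toy version of S343-S344: mean-pool tokens into mention vectors, then
# mean-pool mention vectors into one candidate-argument vector.

def mean_vec(vecs):
    n = len(vecs)
    return [sum(v[d] for v in vecs) / n for d in range(len(vecs[0]))]

def argument_vector(token_vecs, mention_spans):
    """mention_spans: (start, end) token intervals of one argument's mentions."""
    mention_vecs = [mean_vec(token_vecs[s:e]) for s, e in mention_spans]
    return mean_vec(mention_vecs)
```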
S345. A vector ht_NULL is initialized to represent the case where an answer slot is filled with no argument. ht_NULL has the same dimension as the argument representation vectors, and its parameters are continually updated during training so that a representation vector with the meaning "NULL" can be learned. Appending ht_NULL to the candidate argument representation vectors yields the answer candidates, which can be represented as A = {a_1, ..., a_{N_e}, a_{N_e+1}}, where N_e is the number of candidate arguments; when q = N_e + 1, a_q = ht_NULL; otherwise, a_q = ha_q.
S346. By average pooling the intermediate vectors of all the special placeholders forming an answer slot, the representation vector hs_{i,j} of the slot position is obtained, where hs_{i,j} denotes the j-th answer slot of the sample based on event sketch s_i:
hs_{i,j} = mean(HT[slot_{i,j}])
S347. For each answer slot, the most suitable of all answer candidate vectors a_q is selected for filling. The probability of each answer candidate is computed from a_q and hs_{i,j}, where W_k, W_p and W_u are trainable parameters and σ denotes the softmax function:
k_q = W_k a_q
p_{i,j} = W_p hs_{i,j}
P(a_q | hs_{i,j}) = σ(W_u tanh(k_q + p_{i,j}))
S348. The loss is computed by cross-entropy, where y is the true label:
L = -Σ y log P(a_q | hs_{i,j})
S349. At prediction time, for a given answer slot hs_{i,j}, the answer candidate with the highest probability among all candidates is selected:
θ = argmax_q P(a_q | hs_{i,j})
The θ-th answer candidate given by this formula is the answer that should fill the event slot. When the answer is a candidate argument, the argument role slot is filled directly; if the answer is "NULL", the argument role slot is left vacant.
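The selection rule of S349 can be illustrated with a toy scorer: each slot picks the highest-scoring answer candidate, where the candidates are the argument vectors plus a "NULL" vector. The learned scoring of S347 is simplified here to a plain dot product, and all names and numbers are made up.

```python
# Toy illustration of S349: score each answer candidate against a slot vector
# and return the name of the best one ("NULL" means leave the slot vacant).

def fill_slot(slot_vec, answer_vecs, names):
    scores = [sum(a * b for a, b in zip(slot_vec, v)) for v in answer_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return names[best]
```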
Preferably, step S4 proceeds as follows:
S41. The number of iterative corrections cnt is set;
S42. The event arguments contained in the obtained event records are converted into a new event sketch;
S43. The event slots are filled again for the new event sketch, and this iterative filling is repeated cnt times; the final result is the structured result of event extraction.
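The control flow of S41-S43 can be sketched schematically as follows. refill() is a stub standing in for the S3 prompt-guided slot-filling model, and the record/sketch dictionary format is an illustrative assumption.

```python
# Schematic of S4: convert the filled record back into a sketch and re-fill
# it cnt times; the last record is the final structured extraction result.

def iterative_correct(sketch, refill, cnt):
    record = refill(sketch)           # initial filling (S3)
    for _ in range(cnt):
        sketch = {"type": record["type"],
                  "candidates": list(record["args"].values())}
        record = refill(sketch)       # refined filling
    return record
```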
The ChFinAnn dataset contains five classes of events: Equity Freeze (EF), Equity Repurchase (ER), Equity Underweight (EU), Equity Overweight (EO), and Equity Pledge (EP). After applying the above steps of the invention, the experimental results of the method (DMEE-ASP) on this dataset are shown in Tables 1 and 2 below:
TABLE 1 Event type classification results (F1 scores)
Equity Freeze: 100.0
Equity Repurchase: 100.0
Equity Underweight: 99.3
Equity Overweight: 99.5
Equity Pledge: 99.9
Table 2 Event argument extraction experimental results [table reproduced only as images in the original]
As can be seen from Tables 1 and 2, the invention completes both event type classification and event argument extraction for the discourse-level multi-event extraction task without trigger words, with good experimental results: the F1 scores for event type classification all exceed 99, and the overall F1 score for event argument extraction reaches the current best level. The method effectively improves the accuracy of event extraction results and shows a degree of generality and superiority.
The above embodiments express only several implementations of the present invention; their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the inventive concept, all of which fall within the scope of the invention. The protection scope of the present invention should therefore be subject to the appended claims.

Claims (8)

1. A discourse-level multi-event extraction method based on argument-subgraph prompt generation and guidance, characterized by comprising the following steps:
S1: extracting candidate arguments from the input text;
S2: extracting the event sketches contained in the input text;
S3: constructing an argument subgraph prompt based on each event sketch, and filling the event slots under the guidance of the argument subgraph prompt to form an event record;
S4: setting the number of iterations, converting the event record obtained in S3 into a new event sketch, and repeating S3 iteratively to obtain the corrected final event record.
2. The discourse-level multi-event extraction method based on argument-subgraph prompt generation and guidance according to claim 1, characterized in that said step S1 comprises the following steps:
S11: in the training stage, the input event text is processed into an input form conforming to the Longformer model and represented as a text sequence D = {d_1, d_2, ..., d_{N_d}}; the label of each element is annotated using the "BIO" scheme;
S12: the text sequence D input in S11 is encoded by an encoder based on the Longformer pre-trained model to obtain an intermediate vector HF:
HF = Longformer(D)
S13: the intermediate vector HF obtained in S12 is passed through a fully connected layer FC to obtain the final element representation vector ZF:
ZF = FC(HF)
S14: the posterior probability P(l_i | zf_i) of each element's output label l_i is computed through a softmax layer, where W_s and b_s are trainable parameters:
P(l_i | zf_i) = softmax(W_s zf_i + b_s)
S15: for each element of the sequence, the label class with the highest probability is output:
l_i = argmax_l P(l | zf_i)
S16: the "BIO" label sequence of the text elements obtained in S15 is parsed to obtain all candidate argument instances contained in the document; the candidate argument instances are merged through entity disambiguation and fusion and associated with the corresponding candidate argument entities, each instance being called a mention of its candidate argument entity; the resulting candidate argument entity set is represented as the candidate arguments E = {e_1, e_2, ..., e_{N_e}}, and all mention instances of e_i are represented as M_i = {m_{i,1}, m_{i,2}, ...}.
3. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 1, wherein said step S2 comprises the following steps:
S21: judging the pairwise relations between the candidate arguments, modeling relation judgment as a multi-label classification problem in which the relation categories are the event categories plus one additional "threshold" category;
S22: constructing a global candidate argument relation graph from all the obtained candidate arguments and the relations among them; the global candidate argument relation graph is represented as an undirected graph G = (V, E), where V denotes the vertex set, V = {v_1, v_2, ..., v_{N_e}} (|V| = N_e), each vertex v_i being an extracted candidate argument; E denotes the edge set, and each edge (v_i, v_j) ∈ E (i, j ≤ N_e, i ≠ j) indicates that a relation exists between v_i and v_j, the category of the relation being R(v_i, v_j);
S23: extracting subgraphs from the global argument relation graph;
S24: constructing event sketches from the extracted subgraphs; all the obtained event sketches are represented as S = {s_1, s_2, ..., s_{N_s}}, where the type of each event sketch s_i is t_i, i.e., the type of the edges in the subgraph; the candidate argument set contained in event sketch s_i is represented as E_i = {e_1^i, e_2^i, ..., e_{k_i}^i}, i.e., the set of all vertices in the candidate argument subgraph.
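The global graph of S22 can be held in a plain adjacency structure; a sketch with illustrative names, where the pairwise relation classifier of S21 is stubbed out as a given mapping:

```python
def build_relation_graph(candidates, relations):
    """Build the undirected graph G = (V, E) of step S22.

    candidates: list of candidate argument ids (the vertex set V).
    relations:  dict mapping an unordered pair frozenset({vi, vj}) to the
                relation category R(vi, vj) predicted in S21.
    Returns an adjacency map {vertex: {neighbor: category}}.
    """
    adj = {v: {} for v in candidates}
    for pair, category in relations.items():
        vi, vj = sorted(pair)
        adj[vi][vj] = category
        adj[vj][vi] = category  # undirected: store both directions
    return adj
```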
4. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 3, wherein said step S21 comprises the following steps:
S211: an input text sequence D = {d_1, d_2, ..., d_{N_d}} is encoded with a Longformer encoder to obtain the intermediate vectors HS:
HS = {hs_1, hs_2, ..., hs_{N_d}} = Longformer(D)
S212: the position of a mention m_j^i contained in candidate argument e_i is represented by the sequence interval (p_start^{ij}, p_end^{ij}), where p_start^{ij} denotes the start position and p_end^{ij} denotes the end position; the mention representation vector is formed by aggregation with average pooling (Mean):
hm_j^i = Mean(hs_{p_start^{ij}}, ..., hs_{p_end^{ij}})
S213: the representation vector of each candidate argument e_i is computed as the average pooling of all its mention representation vectors:
he_i = Mean(hm_1^i, hm_2^i, ..., hm_{N_i}^i)
S214: two different candidate arguments e are selected in turniAnd ejConverted into a hidden vector by a linear layer and a non-linear layer tanh
Figure FDA00034876750600000214
And
Figure FDA00034876750600000215
Figure FDA0003487675060000031
Figure FDA0003487675060000032
wherein WiAnd WjIs a trainable parameter;
s215: computing probability P of relation category r by bilinear mapping bilinearrWhere σ denotes the softmax function, WrAnd brAre trainable parameters:
Figure FDA0003487675060000033
S216: training is performed using the following adaptive dynamic threshold loss function:
L_1 = − Σ_{r ∈ C_T} log( exp(P_r) / Σ_{r' ∈ C_T ∪ {Th}} exp(P_{r'}) )
L_2 = − log( exp(P_Th) / Σ_{r' ∈ C_N ∪ {Th}} exp(P_{r'}) )
L_total = L_1 + L_2
where C_T is the set of relations formed by the positive classes, C_N is the set of relations formed by the negative classes, and Th denotes the threshold class; in L_1, r ranges over the positive classes and r' over the positive classes plus the threshold class, P_r denoting the probability of class r and P_{r'} the probability of class r'; in L_2, r' ranges over the negative classes plus the threshold class, and P_Th denotes the probability of the threshold class; the loss L_1 drives the probability of the positive classes above that of the threshold class, while L_2 drives the threshold class above the negative classes; the final loss L_total is the sum of L_1 and L_2;
S217: at prediction time, for each relation category it is judged whether its probability exceeds that of the threshold class predicted for the sample, which determines whether the candidate argument pair has a relation of that category; here r denotes a category and Rel(e_i, e_j) denotes the relation between candidate arguments e_i and e_j:
if P_r(e_i, e_j) > P_Th(e_i, e_j), then Rel(e_i, e_j) = r.
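Steps S216–S217 follow the adaptive-threshold pattern: positive classes are pushed above a learned threshold class Th, negative classes below it, and at inference a category is predicted only when it outscores Th for that pair. A plain-Python sketch over raw per-category scores (names are illustrative; a real implementation would operate on batched tensors):

```python
import math

def adaptive_threshold_loss(logits, positive):
    """L_total = L1 + L2 of S216 for one candidate-argument pair.

    logits:   dict {category: score}, including the threshold class "Th".
    positive: set of gold relation categories C_T (may be empty).
    """
    def log_softmax_over(target, pool):
        z = math.log(sum(math.exp(logits[c]) for c in pool))
        return logits[target] - z

    # L1: each positive class must beat the threshold class.
    l1 = -sum(log_softmax_over(r, positive | {"Th"}) for r in positive)
    # L2: the threshold class must beat every negative class.
    negative = set(logits) - positive - {"Th"}
    l2 = -log_softmax_over("Th", negative | {"Th"})
    return l1 + l2

def predict(logits):
    """S217: keep every category scoring above the threshold class."""
    return {r for r in logits if r != "Th" and logits[r] > logits["Th"]}
```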
5. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 3, wherein said step S23 comprises the following steps:
S231: find all complete subgraphs of size k in G: {c_1, c_2, ..., c_n};
S232: define each complete subgraph (k-clique) as a new vertex, and place an edge between two new vertices whenever the number of original vertices they share is greater than or equal to k−1, thereby forming a new graph G_new on which the analysis continues;
S233: find all complete subgraphs in G_new;
S234: the original vertices contained in each such complete subgraph constitute one subgraph, i.e., a candidate argument subgraph to be finally extracted.
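Steps S231–S234 describe a clique-percolation-style extraction. A brute-force sketch for small graphs (exponential in the worst case, so an illustration of the procedure rather than a production implementation; all names are ours):

```python
from itertools import combinations

def k_cliques(vertices, edges, k):
    """S231: all complete subgraphs (cliques) of size k in G, brute force."""
    e = {frozenset(p) for p in edges}
    return [set(c) for c in combinations(sorted(vertices), k)
            if all(frozenset(p) in e for p in combinations(c, 2))]

def extract_subgraphs(vertices, edges, k):
    """S232-S234: link k-cliques sharing >= k-1 original vertices, take the
    complete subgraphs of that clique graph, and union their vertices."""
    cliques = k_cliques(vertices, edges, k)
    idx = range(len(cliques))
    # S232: edges of the new graph G_new over cliques.
    adj = {frozenset((i, j)) for i, j in combinations(idx, 2)
           if len(cliques[i] & cliques[j]) >= k - 1}
    # S233: complete subgraphs of G_new (brute force over index subsets),
    # keeping only the maximal ones.
    complete = [set(s) for r in range(1, len(cliques) + 1)
                for s in combinations(idx, r)
                if all(frozenset(p) in adj for p in combinations(s, 2))]
    maximal = [s for s in complete if not any(s < t for t in complete)]
    # S234: union of the original vertices in each complete subgraph.
    return [set().union(*(cliques[i] for i in s)) for s in maximal]
```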
6. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 1, wherein said step S3 comprises the following steps:
S31: for event sketch s_i, construct a corresponding event prompt template as follows: "In [event type], [argument role 1] is [ans_slot_1], [argument role 2] is [ans_slot_2], ...", where "[event type]" is the type t_i of the event sketch, "[argument role 1]", "[argument role 2]", etc. are the predefined argument roles of this event type, and "[ans_slot_1]", "[ans_slot_2]", etc. are answer slots, each consisting of one or more predefined identifiers;
S32: construct an event sketch template; the candidate arguments contained in the event sketch are converted into a text sequence of the form "[candidate argument 1][RD][candidate argument 2][RD][candidate argument 3]...", where "[candidate argument 1]", "[candidate argument 2]", etc. are the candidate arguments in the sketch's candidate argument set E_i, and "[RD]" is a special delimiter consisting of one or more predefined identifiers;
S33: splice the event prompt template of S31 with the event sketch template obtained in S32, add the prefix "[CLS]", and separate the parts with "[SEP]" to form the argument sub-graph prompt;
S34: fill the event slots.
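The prompt assembly of S31–S33 is pure string building. A sketch with illustrative placeholder tokens — the patent leaves the concrete identifiers behind the answer slots and the "[RD]" delimiter predefined but unspecified, so the literal tokens below are assumptions:

```python
def build_prompt(event_type, roles, candidates, rd="[RD]"):
    """Assemble the argument sub-graph prompt of S31-S33.

    event_type: the sketch type t_i.
    roles:      predefined argument roles for this event type.
    candidates: candidate arguments in the sketch's vertex set E_i.
    """
    # S31: event prompt template "In <type>, <role 1> is <slot 1>, ..."
    pairs = ", ".join(f"{role} is [ans_slot_{j}]"
                      for j, role in enumerate(roles, 1))
    event_prompt = f"In {event_type}, {pairs}"
    # S32: event sketch template, candidates joined by the [RD] delimiter.
    sketch = f" {rd} ".join(candidates)
    # S33: concatenate with the [CLS] prefix and [SEP] separators.
    return f"[CLS] {event_prompt} [SEP] {sketch} [SEP]"
```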
7. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 6, wherein said step S34 comprises the following steps:
S341: splice the argument sub-graph prompt obtained in S33 with the text, using "[SEP]" as a separator; the resulting input text is represented as T = {t_1, t_2, ..., t_{N_t}};
S342: after Longformer encoding, the intermediate vectors HT are generated:
HT = {ht_1, ht_2, ..., ht_{N_t}} = Longformer(T)
S343: the position of a candidate argument mention m_j^i from S1, within the event sketch part and the original text part of the new text sequence, is represented by the interval (p_start^{ij}, p_end^{ij}); the representation vector of the candidate argument mention is obtained by average pooling the intermediate vectors of all the elements it contains:
hm_j^i = Mean(ht_{p_start^{ij}}, ..., ht_{p_end^{ij}})
S344: the representation vector of each candidate argument is obtained by average pooling all of its mention representation vectors:
he_i = Mean(hm_1^i, hm_2^i, ..., hm_{N_i}^i)
S345: initializing a vector htNULLTo represent the case where the answer slot is not filled with any argument; htNULLDimension of and
Figure FDA0003487675060000051
similarly, the vector parameters are continuously updated through training, so that a representation vector with the meaning of NULL is learned; will htNULLSplicing argument candidate expression vectors to obtain answer candidates represented as
Figure FDA0003487675060000052
Wherein N iseIs the number of candidate arguments, when q is equal to Ne+1When the utility model is used, the water is discharged,
Figure FDA0003487675060000053
if not, then,
Figure FDA0003487675060000054
s346: obtaining the expression vector of the answer slot position by performing average pooling on the intermediate vectors of all special placeholders forming a certain answer slot
Figure FDA0003487675060000055
Wherein
Figure FDA0003487675060000056
Representing event-based sketches siThe j-th answer slot of the sample is calculated as follows:
Figure FDA0003487675060000057
S347: for each answer slot, the most suitable of all answer candidate vectors a_q ∈ A is selected for filling; the probability of each answer candidate is calculated by the following formulas, where W_k, W_p and W_u are trainable parameters and σ denotes the softmax function:
k_q = W_k · a_q
u_j^i = W_p · hslot_j^i
P(a_q | slot_j^i) = σ(W_u · tanh(k_q + u_j^i))
S348: the loss function is computed by cross entropy, where y_{j,q}^i is the true label:
L = − Σ_i Σ_j Σ_q y_{j,q}^i · log P(a_q | slot_j^i)
S349: at prediction time, for a given slot slot_j^i, the answer candidate with the highest probability is selected:
θ = argmax_q P(a_q | slot_j^i)
The θ-th answer candidate given by this formula is the answer that should fill the event slot; when the answer is a candidate argument, it is filled directly into the argument role slot, and if the answer is "NULL", the argument role slot is left vacant.
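Steps S345–S349 reduce to additive-attention scoring between one slot vector and each answer candidate (the N_e arguments plus a learned NULL vector), followed by softmax and argmax. A pure-Python sketch over toy vectors, with the projections W_k, W_p and W_u of S347 collapsed to identity for readability (an illustration of the selection logic, not the trained model):

```python
import math

def softmax(xs):
    m = max(xs)                       # subtract max for numeric stability
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def fill_slot(slot_vec, candidate_vecs, null_vec):
    """Score every answer candidate (arguments + NULL) against one answer
    slot and return (theta, probs), where theta indexes the best candidate
    (index len(candidate_vecs) means NULL, i.e. leave the slot vacant).

    The toy score sum(tanh(a + s)) stands in for W_u . tanh(W_k a + W_p s).
    """
    answers = candidate_vecs + [null_vec]          # S345: append ht_NULL
    scores = [sum(math.tanh(a + s) for a, s in zip(vec, slot_vec))
              for vec in answers]                  # S347: additive scoring
    probs = softmax(scores)
    theta = max(range(len(probs)), key=probs.__getitem__)  # S349: argmax
    return theta, probs
```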
8. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 1, wherein said step S4 comprises the following steps:
S41: set the number of iterative corrections cnt;
S42: convert the event arguments contained in the event record obtained in step S3 into a new event sketch;
S43: input the new event sketch obtained in S42 into step S3 and re-fill the event slots; the iterative filling is repeated cnt times, and the final result is the structured result of the event extraction.
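The iterative correction of S41–S43 is a fixed-count refinement loop. A schematic sketch, with `fill_slots` standing in for the whole of step S3 (a hypothetical callable, not defined in the patent text):

```python
def iterative_extract(text, initial_sketch, fill_slots, cnt):
    """S41-S43: repeat step S3 cnt times, feeding each event record back
    in as the next event sketch.

    fill_slots(text, sketch) -> event record (stands in for step S3).
    """
    record = fill_slots(text, initial_sketch)   # first pass of S3
    for _ in range(cnt):                        # S43: cnt corrective passes
        sketch = record                         # S42: record -> new sketch
        record = fill_slots(text, sketch)
    return record
```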
CN202210087670.6A 2022-01-25 2022-01-25 Discourse element sub-graph prompt generation and guide-based discourse-level multi-event extraction method Pending CN114519344A (en)

Publications (1)

Publication Number Publication Date
CN114519344A true CN114519344A (en) 2022-05-20

Family

ID=81596722


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114757189A (en) * 2022-06-13 2022-07-15 粤港澳大湾区数字经济研究院(福田) Event extraction method and device, intelligent terminal and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination