CN114519344A - Discourse element sub-graph prompt generation and guide-based discourse-level multi-event extraction method - Google Patents
- Publication number
- CN114519344A (application number CN202210087670.6A)
- Authority
- CN
- China
- Prior art keywords
- event
- argument
- candidate
- sketch
- graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A10/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
- Y02A10/40—Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping
Abstract
The invention discloses a discourse-level multi-event extraction method based on argument subgraph prompt generation and guidance. The invention uses a discourse-level long-text encoder to obtain complete text features, so that discourse-level and sentence-level information can be exploited simultaneously. Multiple events are referenced and located through event sketches generated by multi-argument relation extraction, and arguments are classified by filling event slots with a pre-training model method based on the prompt paradigm, improving the accuracy of multi-event extraction. The method requires no trigger words, which reduces the annotation burden on the data set.
Description
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a discourse-level multi-event extraction method based on argument subgraph prompt generation and guidance.
Background
With the rapid development of internet technology, massive amounts of data have flooded into people's lives. To process large-scale data quickly and mine the potentially valuable information it contains, the demand for information extraction technology keeps growing. Event extraction is an important task in the field of information extraction: it aims to detect the occurrence of an event in unstructured natural language text, judge the event type, extract the important elements participating in the event, and present the result in a structured form. Event extraction has wide application value. On one hand, it can supply structured multivariate relation information and bring performance improvements to machine reading comprehension and knowledge graph construction. On the other hand, it can help people understand how events unfold and assist analysis and decision-making in practical application domains.
Currently, most research on event extraction uses deep learning to model event extraction as a sequence labeling problem: first a trigger word is extracted (if a trigger word is present, an event is considered to have occurred); then arguments are extracted from the text; finally, each argument is judged for relatedness to the trigger word to decide whether it belongs to the event the trigger word refers to. However, such methods have the following disadvantages:
1. Only sentence-level information is considered, while discourse-level information is ignored. In real scenarios, events exhibit argument scattering: the text describing one event in a document is usually spread across multiple sentences, so discourse-level information must be considered to obtain a complete extraction result.
2. Accuracy is low for multi-event extraction. In real-world documents, multiple events are often interleaved. Existing methods rely on trigger words to refer to events when extracting multiple events, but real-scene trigger words are often hard to identify: one event may contain several trigger words, one trigger word may correspond to several event types, and some events have no obvious trigger word at all. Trigger-word-dependent methods therefore produce redundant or missing extractions, giving poor multi-event results.
3. Over-reliance on trigger words burdens data annotation. Existing methods usually use the trigger word as a medium, but the trigger word is only an intermediate result of event extraction, is not strictly necessary, and is very difficult to label, which increases the burden of manually constructing data sets.
In summary, existing technical solutions ignore discourse-level information, achieve low multi-event extraction accuracy, and depend excessively on trigger words.
Disclosure of Invention
Aiming at the defects of existing methods, the invention provides a novel discourse-level multi-event extraction method based on argument subgraph prompt generation and guidance. The method uses a discourse-level long-text encoder and can exploit discourse-level and sentence-level information simultaneously. Multiple events are referenced and located through event sketches generated by multi-argument relation extraction, and arguments are classified by filling event slots with a pre-training model method based on the prompt paradigm, improving the accuracy of multi-event extraction. The method requires no trigger words, which reduces the annotation burden on the data set.
The technical scheme of the invention is as follows:
a discourse-level multi-event extraction method based on argument sub-graph prompt generation and guidance comprises the following steps:
s1: extracting candidate arguments from the input text;
s2: extracting an event sketch contained in the input text;
s3: constructing an argument subgraph prompt based on the event sketch, and filling event slots under the guidance of the argument subgraph prompt to form an event record;
s4: setting the number of iterations, converting the event record obtained in S3 into a new event sketch, and iteratively repeating S3 to obtain the corrected final event record.
As a preferred embodiment of the present invention, the step S1 includes the following steps:
s11: in the training stage, the input event text is processed into an input form conforming to the Longformer model and represented as a text sequence D; the label of each element is annotated in the "BIO" scheme;
s12: the text sequence D input in S11 is encoded by an encoder based on the Longformer pre-training model to obtain an intermediate vector HF: HF = Longformer(D);
s13: the intermediate vector HF obtained in S12 is passed through a fully connected layer FC to obtain the final element representation vector ZF: ZF = FC(HF);
s14: computing per-element output tag/through softmax layeriA posteriori probability P (l)i|zfi) Wherein W issAnd bsAre trainable parameters:
P(li|zfi)=softmax(Wszfi+bs)
S16: analyzing a label sequence of the text sequence element obtained in the step S15 for a 'BIO' label to obtain all candidate argument instances contained in the document, merging the candidate argument instances through entity disambiguation and fusion, and associating the candidate argument instances to corresponding candidate argument entities, wherein each instance is called a mention of the candidate argument entities; and representing the obtained candidate argument entity set as candidate argumentseiAll mentioned examples of (A) are indicated as
As a preferred embodiment of the present invention, the step S2 includes the following steps:
s21: judging pairwise relations between the candidate arguments, modeling relation judgment as a multi-label classification problem in which the relation categories equal the event categories plus one additional "threshold" category;
s22: constructing a global candidate argument relation graph from all the obtained candidate arguments and the relations among them; the global candidate argument relation graph is represented as an undirected graph G = (V, E), where V is the vertex set (|V| = N_e) and each vertex v_i is an extracted candidate argument; E is the edge set, and each (v_i, v_j) ∈ E (i, j ≤ N_e, i ≠ j) indicates that a relation exists between v_i and v_j, with relation category R(v_i, v_j);
S23: extracting subgraphs from the global argument relation graph;
s24: constructing event sketches from the extracted subgraphs; all obtained event sketches are denoted S = {s_1, s_2, ...}; the type of each event sketch s_i is t_i, i.e., the type of the edges in its subgraph; the candidate argument set contained in event sketch s_i is the set of all vertices of its candidate argument subgraph.
As a preferred embodiment of the present invention, the step S3 includes the following steps:
s31: for each event sketch s_i, a corresponding event prompt template is constructed as: "In [event type], [argument role 1] is [ans_slot_1], [argument role 2] is [ans_slot_2], ...", where "[event type]" is the type t_i of the event sketch, "[argument role 1]" and "[argument role 2]" are the predefined argument roles of that event type, and "[ans_slot_1]" and "[ans_slot_2]" are answer slots, each consisting of one or more predefined identifiers;
s32: constructing the event sketch template; the candidate arguments contained in the event sketch are converted into a text sequence of the form "[candidate argument 1][RD][candidate argument 2][RD][candidate argument 3]...", where "[candidate argument 1]" and "[candidate argument 2]" are candidate arguments extracted in the event sketch and "[RD]" is a special delimiter consisting of one or more predefined identifiers;
s33: splicing the event prompt template of S31 with the event sketch template obtained in S32, adding the prefix "[CLS]" and separating with "[SEP]", so as to form the argument subgraph prompt;
s34: filling the event slot.
As a preferred embodiment of the present invention, the step S4 includes the following steps:
s41: setting iterative correction times cnt;
s42: converting event arguments contained in the event record obtained in the step S3 into a new event sketch;
s43: inputting the new event sketch obtained in S42 into step S3 and re-filling the event slots; after repeating this iterative filling cnt times, the final result is the structured result of event extraction.
Compared with the prior art, the invention has the following beneficial effects:
(1) the invention uses a Longformer-based pre-training model capable of processing very long texts as the text encoder, directly exploiting discourse-level information, enabling the flow of global and local document information, and improving the completeness of event extraction results.
(2) The invention constructs a framework that improves discourse-level multi-event extraction through several optimizations. First, an adaptive threshold method and a clique-percolation-based subgraph extraction method are applied to construct event sketches that refer to multiple events and yield preliminary event records. The event extraction task is then converted into a slot-filling task, and a pre-training model method based on the prompt paradigm introduces the large amount of background knowledge contained in the pre-trained model, improving the accuracy of event slot filling. Finally, the result is iteratively corrected several times, improving the overall extraction effect.
(3) The invention first judges binary relations between candidate arguments and then extracts multivariate subgraphs from the global candidate argument relation graph, finally obtaining event sketches with multivariate argument relations as references for multiple events. Multiple events can thus be extracted without annotated trigger words, reducing the annotation requirements on data sets and the labor and time of data-set construction.
Drawings
FIG. 1 is a flowchart of a discourse-level multi-event extraction method based on argument sub-graph prompt generation and guidance.
Detailed Description
To illustrate the technical method of the invention more clearly, the implementation steps of the discourse-level multi-event extraction method based on argument subgraph prompt generation and guidance are illustrated concretely on the public ChFinAnn event data set.
As shown in fig. 1, the method of the present invention comprises the following four steps:
s1: extracting candidate arguments from the input text;
s2: extracting an event sketch contained in the input text;
s3: constructing an argument subgraph prompt based on the event sketch, and filling event slots under the guidance of the argument subgraph prompt to form an event record;
s4: setting the number of iterations, converting the event record obtained in S3 into a new event sketch, and iteratively repeating S3 to obtain the corrected final event record.
Preferably, the step of S1 is as follows:
s11, the input event text is processed into an input form conforming to the Longformer model, which can be represented as a text sequence D; the label of each element is annotated in the "BIO" scheme.
S12, the input text sequence D is encoded by an encoder based on the Longformer pre-training model to obtain an intermediate vector HF: HF = Longformer(D).
S13, the obtained intermediate vector HF is passed through a fully connected layer FC to obtain the final element representation vector ZF: ZF = FC(HF).
S14, the posterior probability P(l_i|zf_i) of each element's output label l_i is computed through a softmax layer, where W_s and b_s are trainable parameters:
P(l_i|zf_i) = softmax(W_s · zf_i + b_s)
And S16, the obtained "BIO" label sequence of the text-sequence elements is parsed to obtain all candidate argument instances contained in the document; the instances are merged through entity disambiguation and fusion and associated with their corresponding candidate argument entities, each instance being called a mention of its entity. The final candidate argument entity set can be represented as E = {e_1, ..., e_{N_e}}; all mention instances of candidate argument e_i form its mention set.
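The "BIO" decoding and mention grouping of S16 can be sketched minimally in Python. The tag scheme ("B-ARG"/"I-ARG"/"O"), the space-free token joining, and grouping by identical surface string as a stand-in for the entity disambiguation and fusion described above are all illustrative assumptions, not the patent's exact procedure.

```python
def decode_bio(tokens, tags):
    """Return (text, start, end) spans for every B-/I- tag run."""
    spans, start = [], None
    for i, tag in enumerate(tags + ["O"]):          # sentinel "O" flushes the last span
        if tag.startswith("B-") or tag == "O":
            if start is not None:
                spans.append(("".join(tokens[start:i]), start, i))
                start = None
            if tag.startswith("B-"):
                start = i
    return spans

def group_mentions(spans):
    """Associate mention instances to candidate argument entities (here: by surface string)."""
    entities = {}
    for text, s, e in spans:
        entities.setdefault(text, []).append((s, e))
    return entities
```

Each key of the returned dictionary plays the role of a candidate argument entity e_i, with its value list as the entity's mention set.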
Preferably, the step of S2 is as follows:
s21: the relation between every two candidate arguments is judged; relation judgment is modeled as a multi-label classification problem in which the relation categories equal the event categories plus one additional "threshold" category, as detailed in S211-S217:
S211, the input text sequence D is encoded with a Longformer encoder to obtain an intermediate vector HS: HS = Longformer(D).
S212, each mention of candidate argument e_i is represented by its sequence position interval, comprising a start position and an end position; the mention representation vector is formed by average pooling (Mean) of the intermediate vectors within this interval.
S213, the representation vector of each candidate argument e_i is computed by average pooling the representation vectors of all its mentions.
S214, two different candidate arguments e_i and e_j are selected in turn and converted into hidden vectors through a linear layer followed by the nonlinearity tanh, where W_i and W_j are trainable parameters.
S215: the probability P_r of relation category r is computed through a bilinear mapping, where σ denotes the softmax function and W_r and b_r are trainable parameters.
S216, training is performed with the following adaptive dynamic threshold loss function:
L_total = L_1 + L_2
where the set of relations of the positive classes is C_T and the set of relations of the negative classes is C_N; the threshold class is written TH. In L_1, r ranges over the positive classes and r' over the positive classes plus the threshold class; P_r denotes the probability of class r and P_r' the probability of class r'. In L_2, r' ranges over the negative classes plus the threshold class; P_TH denotes the probability of the threshold class. The L_1 loss pushes the probability of every positive class above that of the threshold class, while the L_2 loss pushes the threshold class above every negative class; the final loss L_total is the sum of L_1 and L_2.
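The adaptive dynamic threshold loss can be sketched as follows. The patent's own formulas are not reproduced in the text, so this follows the ATLOP-style formulation that the description matches — L_1 pushes each positive class above the threshold class TH, and L_2 pushes TH above every negative class; treat it as an assumption, not the exact loss. Logits stand in for the pre-softmax bilinear scores of S215.

```python
import math

def adaptive_threshold_loss(logits, positive, th=0):
    """logits: dict class -> score; positive: set of positive classes; th: threshold class."""
    negative = set(logits) - set(positive) - {th}

    def log_softmax(target, pool):
        # log of the target's softmax probability restricted to the given class pool
        z = math.log(sum(math.exp(logits[c]) for c in pool))
        return logits[target] - z

    # L1: each positive class competes against {positives, TH}
    l1 = -sum(log_softmax(r, set(positive) | {th}) for r in positive)
    # L2: TH competes against {negatives, TH}
    l2 = -log_softmax(th, negative | {th})
    return l1 + l2
```

When the positive class scores sit well above TH and TH sits above the negatives, the loss approaches zero; inverted orderings are penalized.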
s217: at prediction time, a candidate argument pair is judged to hold relation category r whenever the probability of r exceeds that sample's predicted threshold class, where r denotes a category and Rel denotes the relation between candidate arguments e_i and e_j:
if P_r(e_i, e_j) > P_TH(e_i, e_j), then Rel(e_i, e_j) = r
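The S217 decision rule is simple enough to state directly in code; the class ids and probabilities below, including using 0 as the threshold class, are illustrative choices.

```python
def predict_relations(probs, th=0):
    """probs: dict class -> probability for one candidate-argument pair.
    Return every relation category whose probability beats the threshold class."""
    return {r for r, p in probs.items() if r != th and p > probs[th]}
```

A pair with no class above its own threshold probability is predicted to hold no relation at all, which is how the adaptive threshold replaces a fixed global cutoff.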
and S22, a global candidate argument relation graph is constructed from all the obtained candidate arguments and the relations among them. This graph may be represented as an undirected graph G = (V, E), where V is the vertex set (|V| = N_e) and each vertex v_i is a candidate argument the model has extracted; E is the edge set, and each (v_i, v_j) ∈ E (i, j ≤ N_e, i ≠ j) indicates that a relation exists between v_i and v_j, with relation category R(v_i, v_j).
S23: extracting subgraphs from the global argument relationship graph, as shown in S231-S234:
s231. all complete subgraphs (k-cliques) of size k in G are found: {c_1, c_2, ..., c_n};
S232, each k-clique complete subgraph is defined as a new vertex, and an edge is placed between two new vertices whenever they share at least k-1 original vertices, thereby forming a new graph G_new on which the analysis continues;
S233, all complete subgraphs in G_new are found;
and S234, all original vertices contained in each such complete subgraph form one subgraph, i.e., a finally extracted candidate argument subgraph.
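The k-clique percolation of S231-S234 can be sketched in pure Python — exhaustive clique enumeration plus a small union-find, which is adequate for the small candidate-argument graphs assumed here (the connected-component merge replaces the explicit G_new of S232-S233 but yields the same vertex groups).

```python
from itertools import combinations

def clique_percolation(vertices, edges, k):
    """Return the vertex sets of the candidate argument subgraphs."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    # S231: enumerate all k-cliques
    cliques = [set(c) for c in combinations(vertices, k)
               if all(b in adj[a] for a, b in combinations(c, 2))]
    # S232: link cliques sharing >= k-1 vertices (union-find instead of an explicit G_new)
    parent = list(range(len(cliques)))
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)
    # S233-S234: the union of each clique component is one candidate argument subgraph
    groups = {}
    for i, c in enumerate(cliques):
        groups.setdefault(find(i), set()).update(c)
    return list(groups.values())
```

For example, two triangles sharing an edge percolate into a single four-vertex subgraph, while a vertex in no k-clique is dropped.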
And S24, event sketches are constructed from the extracted subgraphs. All resulting event sketches can be represented as S = {s_1, s_2, ...}; the type of each event sketch s_i is t_i, i.e., the type of the edges in its subgraph; the candidate argument set contained in event sketch s_i is the set of all vertices of its candidate argument subgraph.
Preferably, the step of S3 is as follows:
s31, for each event sketch s_i, a corresponding event prompt template is constructed as: "In [event type], [argument role 1] is [ans_slot_1], [argument role 2] is [ans_slot_2], ...", where "[event type]" is the type t_i of the event sketch, "[argument role 1]" and "[argument role 2]" are the predefined argument roles of that event type, and "[ans_slot_1]" and "[ans_slot_2]" are answer slots, each consisting of one or more predefined identifiers; in this embodiment each answer slot uses a pattern of two identifiers such as "[unused1][unused2]", with the sequence numbers incremented from slot to slot.
And S32, the event sketch template is constructed. The candidate arguments contained in the event sketch are converted into a text sequence of the form "[candidate argument 1][RD][candidate argument 2][RD][candidate argument 3]...", where "[candidate argument 1]" and "[candidate argument 2]" are candidate arguments extracted in the event sketch, and "[RD]" is a special delimiter consisting of one or more predefined identifiers; this embodiment uses "[unused80]".
And S33, the event prompt template of S31 and the event sketch template of S32 are spliced together, the prefix "[CLS]" is added, and "[SEP]" is used as a separator, so that the argument subgraph prompt is formed.
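Assembling the argument subgraph prompt of S31-S33 is plain string work. The event type, role names, and candidate arguments below are made-up examples; the "[unused*]" numbering follows the embodiment's stated convention (two identifiers per answer slot, "[unused80]" as the [RD] delimiter).

```python
def build_prompt(event_type, roles, candidates):
    """Splice the event prompt template (S31) and event sketch template (S32) per S33."""
    # S31: one two-identifier answer slot per predefined argument role
    slots = ["[unused{}][unused{}]".format(2 * i + 1, 2 * i + 2)
             for i in range(len(roles))]
    event_part = "In {}, {}".format(
        event_type,
        ", ".join("{} is {}".format(r, s) for r, s in zip(roles, slots)))
    # S32: candidate arguments joined by the [RD] delimiter
    sketch_part = "[unused80]".join(candidates)
    # S33: "[CLS]" prefix, "[SEP]" separators
    return "[CLS]" + event_part + "[SEP]" + sketch_part + "[SEP]"
```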
S34: filling the event slot, wherein the specific steps are shown as S341-S349:
s341, the argument subgraph prompt obtained in S33 is spliced with the original text, using "[SEP]" as a separator, to form a new input text sequence.
S342. after Longformer encoding, an intermediate vector HT is generated.
S343, each mention of the extracted candidate arguments is located in the event-sketch part and the original-text part of the new text sequence as a position interval; the representation vector of a candidate argument mention is obtained by average pooling the intermediate vectors of all elements the mention covers.
S344, the representation vector of a candidate argument is obtained by average pooling the representation vectors of all its mentions.
S345. a vector ht_NULL is initialized to represent the case in which an answer slot is filled with no argument. ht_NULL has the same dimension as the candidate argument representation vectors, and its parameters are continuously updated during training so that it learns a representation meaning "NULL". Appending ht_NULL to the candidate argument representation vectors yields the answer candidates, N_e + 1 in total, where N_e is the number of candidate arguments: the (N_e + 1)-th answer candidate is ht_NULL, and every other answer candidate is the representation vector of the corresponding candidate argument.
S346, the representation vector of an answer slot position is obtained by average pooling the intermediate vectors of all the special placeholders that form that slot, where the j-th answer slot belongs to the sample built from event sketch s_i.
S347. for each answer slot, the most suitable of all answer candidate vectors is selected as the filling; the probability of each answer candidate is computed with a formula in which W_k, W_p and W_u are trainable parameters and σ denotes the softmax function.
S349, at prediction time, for each answer slot the answer candidate with the highest probability among all candidates is selected; the θ-th answer candidate obtained in this way is the answer that should fill the event slot. When the answer is a candidate argument, it directly fills the argument role slot; if the answer is "NULL", the argument role slot is left vacant.
Preferably, the step of S4 is as follows:
s41, setting iterative correction times cnt;
s42, converting the event arguments contained in the obtained event records into a new event sketch;
and S43, filling the event slot again for the obtained new event sketch, and repeatedly iterating and filling the cnt for times, wherein the final result is the structured result of event extraction.
The ChFinAnn data set contains five classes of events: Equity Freeze (EF), Equity Repurchase (ER), Equity Underweight (EU, share reduction), Equity Overweight (EO, share increase), and Equity Pledge (EP). After the above steps of the invention, the experimental results of the method (DMEE-ASP) on this data set are shown in Tables 1 and 2 below:
TABLE 1 Event type classification results

| Event type | DMEE-ASP (F1) |
| Equity Freeze | 100.0 |
| Equity Repurchase | 100.0 |
| Equity Underweight | 99.3 |
| Equity Overweight | 99.5 |
| Equity Pledge | 99.9 |
Table 2 event argument extraction experimental results
As can be seen from Tables 1 and 2, the invention completes both the event type classification and the event argument extraction of the discourse-level multi-event extraction task without trigger words, with strong experimental results: every F1 value for event type classification exceeds 99, and the overall F1 value for event argument extraction reaches the current best level. The method effectively improves the accuracy of event extraction results and shows a degree of generality and superiority.
The above-mentioned embodiments express only several embodiments of the present invention, and their description is relatively specific and detailed, but it is not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these fall within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the appended claims.
Claims (8)
1. A discourse-level multi-event extraction method based on argument sub-graph prompt generation and guidance is characterized by comprising the following steps:
s1: extracting candidate arguments from the input text;
s2: extracting an event sketch contained in the input text;
s3: constructing an argument subgraph prompt based on the event sketch, and filling event slots under the guidance of the argument subgraph prompt to form an event record;
s4: setting the number of iterations, converting the event record obtained in S3 into a new event sketch, and iteratively repeating S3 to obtain a corrected final event record.
2. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 1, wherein said step S1 comprises the following steps:
s11: in the training stage, the input event text is processed into an input form conforming to the Longformer model and represented as a text sequence D; the label of each element is annotated in the "BIO" scheme;
s12: the text sequence D input in S11 is encoded by an encoder based on the Longformer pre-training model to obtain an intermediate vector HF: HF = Longformer(D);
s13: the intermediate vector HF obtained in S12 is passed through a fully connected layer FC to obtain the final element representation vector ZF: ZF = FC(HF);
s14: the posterior probability P(l_i|zf_i) of each element's output label l_i is computed through a softmax layer, where W_s and b_s are trainable parameters:
P(l_i|zf_i) = softmax(W_s · zf_i + b_s)
S16: the "BIO" label sequence of the text-sequence elements obtained in step S15 is parsed to obtain all candidate argument instances contained in the document; the instances are merged through entity disambiguation and fusion and associated with their corresponding candidate argument entities, each instance being called a mention of its entity; the resulting candidate argument entity set is denoted E = {e_1, ..., e_{N_e}}, and all mention instances of candidate argument e_i form its mention set.
3. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 1, wherein said step S2 comprises the following steps:
s21: judging pairwise relations between the candidate arguments, modeling relation judgment as a multi-label classification problem in which the relation categories equal the event categories plus one additional "threshold" category;
s22: constructing a global candidate argument relation graph from all the obtained candidate arguments and the relations among them; the global candidate argument relation graph is represented as an undirected graph G = (V, E), where V is the vertex set (|V| = N_e) and each vertex v_i is an extracted candidate argument; E is the edge set, and each (v_i, v_j) ∈ E (i, j ≤ N_e, i ≠ j) indicates that a relation exists between v_i and v_j, with relation category R(v_i, v_j);
S23: extracting subgraphs from the global argument relation graph;
s24: constructing event sketches from the extracted subgraphs; all obtained event sketches are denoted S = {s_1, s_2, ...}; the type of each event sketch s_i is t_i, i.e., the type of the edges in its subgraph; the candidate argument set contained in event sketch s_i is the set of all vertices of its candidate argument subgraph.
4. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 3, wherein said step S21 comprises the following steps:
s211: for an input text sequenceEncoding by using a Longformer encoder to obtain an intermediate vector HS:
s212: candidate argument eiIncluding the reference toIs represented by the sequence position of Which represents the starting position of the device,represents an end position; aggregating to form the mentioned representation vectors using means of average pooling (Mean)
S213: by computing candidate arguments eiAverage pooling values of all mentioned representative vectors, calculating a representative vector for each candidate argument
S214: two different candidate arguments e are selected in turniAnd ejConverted into a hidden vector by a linear layer and a non-linear layer tanhAnd
wherein WiAnd WjIs a trainable parameter;
s215: computing probability P of relation category r by bilinear mapping bilinearrWhere σ denotes the softmax function, WrAnd brAre trainable parameters:
s216: training is performed using an adaptive dynamic threshold loss function as follows:
Ltotal=L1+L2
wherein the relationship set formed by the positive classes is CTThe relationship set composed of the negative classes is CN(ii) a The sign of the threshold class is Th; at L1Where r belongs to the positive class, r' belongs to the positive class and the threshold class, PrProbability of representing r class, Pr′Probability of representing r' category; at L2In, r' belongs to the negative class and the threshold class, Pr′Probability of representing r' class, PTHRepresenting a probability of a threshold class; l is1Probability of loss optimization positive class is greater thanClass of thresholds, L2The loss is such that the loss of the threshold class is greater than the negative class; ultimate loss LtotalIs L1And L2The sum of (a);
S217: at prediction time, judge for each relation category r whether its probability exceeds that of the threshold class predicted for the sample, to decide whether the candidate arguments hold a relation of that category, wherein Rel denotes the relation between candidate arguments ei and ej:

if Pr(ei, ej) > PTh(ei, ej), then Rel(ei, ej) = r.
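The relation scoring and adaptive dynamic threshold loss of steps S214-S217 can be sketched as follows. This is an illustrative NumPy formulation, not the patent's implementation: the per-class bilinear weights are assumed to be d x d matrices, and the loss operates on raw scores rather than softmax outputs, as is standard for this style of threshold loss.

```python
import numpy as np

def relation_logits(z_i, z_j, W_r, b_r):
    """Bilinear score z_i^T W_r z_j + b_r for every relation class r.

    W_r is assumed to hold one d x d matrix per class, shape (num_rel, d, d)."""
    return np.einsum("d,rde,e->r", z_i, W_r, z_j) + b_r

def adaptive_threshold_loss(logits, positive, th=0):
    """Adaptive dynamic threshold loss L_total = L1 + L2 (cf. S216).

    L1 pushes each positive class above the threshold class Th;
    L2 pushes Th above all negative classes."""
    positive = list(positive)
    neg = [c for c in range(len(logits)) if c != th and c not in positive]
    l1 = 0.0
    for r in positive:
        group = positive + [th]  # positive classes plus the threshold class
        l1 -= logits[r] - np.log(np.sum(np.exp(logits[group])))
    # negative classes plus the threshold class
    l2 = -(logits[th] - np.log(np.sum(np.exp(logits[neg + [th]]))))
    return l1 + l2

def predict(logits, th=0):
    """Cf. S217: relation r is predicted iff its score exceeds the threshold class's."""
    return [r for r in range(len(logits)) if r != th and logits[r] > logits[th]]
```

The threshold class is treated as an ordinary class whose score is learned per sample, so no global decision threshold needs to be tuned.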
5. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 3, wherein said step S23 comprises the following steps:
S231: find all complete subgraphs of size k (k-cliques) in G, denoted {c1, c2, ..., cn};

S232: define each k-clique as a new vertex, and add an edge between two new vertices whenever the number of original vertices they share is greater than or equal to k-1, thereby forming a new graph Gnew on which the analysis continues;

S233: find all complete subgraphs in Gnew;

S234: the original vertices contained in each such complete subgraph together constitute a subgraph, namely a candidate argument subgraph to be extracted.
6. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 1, wherein said step S3 comprises the following steps:
S31: for an event sketch si, construct a corresponding event prompt template as follows: "In [event type], [argument role 1] is [ans_slot_1], [argument role 2] is [ans_slot_2], ...", wherein "[event type]" is the type ti of the event sketch, "[argument role 1]", "[argument role 2]", and so on are the predefined argument roles of that event type, and "[ans_slot_1]", "[ans_slot_2]", and so on are answer slots, each consisting of one or more predefined identifiers;
S32: construct an event sketch template; the candidate arguments contained in the event sketch are converted into a text sequence of the form "[candidate argument 1][RD][candidate argument 2][RD][candidate argument 3]...", wherein "[candidate argument 1]", "[candidate argument 2]", and so on are candidate arguments extracted from the event sketch, and "[RD]" is a special delimiter consisting of one or more predefined identifiers;
S33: splice the event prompt template from S31 with the event sketch template obtained in S32, add the prefix "[CLS]", and separate the two with "[SEP]", thereby forming the event subgraph prompt;
S34: fill the event slots.
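The template construction of steps S31-S33 is plain string assembly. A minimal sketch, assuming "[ANS]" and "[RD]" as the predefined identifier strings (the patent leaves the concrete identifiers open):

```python
def build_event_subgraph_prompt(event_type, roles, candidate_args,
                                ans_slot="[ANS]", rd="[RD]"):
    """S31-S33 sketch: event prompt template plus event sketch template,
    joined with the [CLS] prefix and [SEP] separators."""
    # S31: "In [event type], [role 1] is [ans_slot_1], [role 2] is [ans_slot_2], ..."
    prompt = f"In [{event_type}], " + ", ".join(
        f"[{role}] is {ans_slot}" for role in roles)
    # S32: candidate arguments joined by the special delimiter [RD]
    sketch = rd.join(f"[{arg}]" for arg in candidate_args)
    # S33: prefix with [CLS], separate the two templates with [SEP]
    return f"[CLS]{prompt}[SEP]{sketch}[SEP]"
```

For example, an "EquityFreeze" sketch with roles "holder" and "amount" and candidates "CompanyA" and "100 shares" yields "[CLS]In [EquityFreeze], [holder] is [ANS], [amount] is [ANS][SEP][CompanyA][RD][100 shares][SEP]".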
7. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 6, wherein said step S34 comprises the following steps:
S341: splice the argument subgraph prompt obtained in S33 with the text, using "[SEP]" as a separator, to form the new input text sequence;
S342: after Longformer encoding, generate an intermediate vector HT;
S343: the positions of the candidate argument mentions from S1, the event sketch, and the original text portion within the new text sequence are represented as intervals; obtain the representation vector of each candidate argument mention by average pooling the intermediate vectors of all the elements the mention contains;
S344: obtain the representation vector of each candidate argument by average pooling the representation vectors of all its mentions;
S345: initializing a vector htNULLTo represent the case where the answer slot is not filled with any argument; htNULLDimension of andsimilarly, the vector parameters are continuously updated through training, so that a representation vector with the meaning of NULL is learned; will htNULLSplicing argument candidate expression vectors to obtain answer candidates represented asWherein N iseIs the number of candidate arguments, when q is equal to Ne+1When the utility model is used, the water is discharged,if not, then,
S346: obtain the representation vector of each answer slot by average pooling the intermediate vectors of all the special placeholders that form the slot, wherein the j-th answer slot belongs to the sample constructed from event sketch si;
S347: for each answer slot, select the most suitable one among all answer candidate vectors for filling; the probability of each answer candidate is computed by a scoring function with trainable parameters Wk, Wp and Wu, wherein σ denotes the softmax function;
S349: at prediction time, for each answer slot, select from all answer candidates the one with the highest probability; the θ-th answer candidate obtained in this way is the answer to be filled into the event slot; when the answer is a candidate argument, the argument role slot is filled directly, and when the answer is NULL, the argument role slot is left vacant.
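The slot-filling steps S345-S349 can be sketched as follows. The claim names only the trainable parameters Wk, Wp, Wu, so an additive-attention scoring form is assumed here; function names and shapes are illustrative, not the patent's exact formula.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fill_answer_slots(cand_vecs, h_null, slot_vecs, W_k, W_p, w_u):
    """S345-S349 sketch: append a learned NULL vector to the Ne candidate
    representations, score all Ne+1 answer candidates against each answer
    slot, and take the argmax per slot (None stands for a vacant slot)."""
    answers = np.vstack([cand_vecs, h_null[None, :]])   # S345: Ne+1 answer candidates
    n_cand = len(cand_vecs)
    filled = []
    for slot in slot_vecs:
        # S347: assumed additive interaction between candidate and slot vectors
        scores = np.tanh(answers @ W_k + slot @ W_p) @ w_u
        probs = softmax(scores)                         # σ over answer candidates
        theta = int(np.argmax(probs))                   # S349: highest probability
        filled.append(None if theta == n_cand else theta)  # NULL -> leave vacant
    return filled
```

The NULL vector is trained jointly with the rest of the model, so the argmax can legitimately choose "no argument" for a slot.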
8. The method for discourse-level multi-event extraction based on argument sub-graph prompt generation and guidance as claimed in claim 1, wherein said step S4 comprises the following steps:
S41: set the number of iterative correction rounds cnt;

S42: convert the event arguments contained in the event records obtained in step S3 into a new event sketch;

S43: feed the new event sketch obtained in step S42 back into step S3 and re-fill the event slots; repeat this iterative filling cnt times, and the final result is the structured result of the event extraction.
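The iterative correction loop of steps S41-S43 reduces to feeding each round's filled records back in as sketches. A minimal sketch, where `fill_slots` stands in for step S3 and `to_sketch` for step S42; both are assumed interfaces, not the patent's exact ones:

```python
def iterative_event_extraction(text, event_sketches, fill_slots, to_sketch, cnt=2):
    """S41-S43 sketch: run slot filling once, then perform cnt correction
    rounds, each time converting the records back into event sketches."""
    records = [fill_slots(text, s) for s in event_sketches]  # initial S3 pass
    for _ in range(cnt):                                     # S41/S43: cnt rounds
        sketches = [to_sketch(r) for r in records]           # S42
        records = [fill_slots(text, s) for s in sketches]    # re-fill the slots
    return records                                           # structured result
```

Each round lets arguments filled for one event inform the re-filling of the others, which is the stated purpose of the correction step.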
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210087670.6A CN114519344A (en) | 2022-01-25 | 2022-01-25 | Discourse element sub-graph prompt generation and guide-based discourse-level multi-event extraction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114519344A true CN114519344A (en) | 2022-05-20 |
Family
ID=81596722
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114757189A (en) * | 2022-06-13 | 2022-07-15 | 粤港澳大湾区数字经济研究院(福田) | Event extraction method and device, intelligent terminal and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111581961B (en) | Automatic description method for image content constructed by Chinese visual vocabulary | |
CN110598005B (en) | Public safety event-oriented multi-source heterogeneous data knowledge graph construction method | |
CN114169330B (en) | Chinese named entity recognition method integrating time sequence convolution and transform encoder | |
CN112487143A (en) | Public opinion big data analysis-based multi-label text classification method | |
CN112989841B (en) | Semi-supervised learning method for emergency news identification and classification | |
CN111382565A (en) | Multi-label-based emotion-reason pair extraction method and system | |
CN111401084A (en) | Method and device for machine translation and computer readable storage medium | |
CN112612871A (en) | Multi-event detection method based on sequence generation model | |
CN111581368A (en) | Intelligent expert recommendation-oriented user image drawing method based on convolutional neural network | |
CN112699685B (en) | Named entity recognition method based on label-guided word fusion | |
CN116245107B (en) | Electric power audit text entity identification method, device, equipment and storage medium | |
CN116383399A (en) | Event public opinion risk prediction method and system | |
CN115759119A (en) | Financial text emotion analysis method, system, medium and equipment | |
Yang et al. | Generative counterfactuals for neural networks via attribute-informed perturbation | |
CN116595406A (en) | Event argument character classification method and system based on character consistency | |
CN111597816A (en) | Self-attention named entity recognition method, device, equipment and storage medium | |
JP2022151838A (en) | Extraction of open information from low resource language | |
CN114519344A (en) | Discourse element sub-graph prompt generation and guide-based discourse-level multi-event extraction method | |
CN113901813A (en) | Event extraction method based on topic features and implicit sentence structure | |
CN116108127A (en) | Document level event extraction method based on heterogeneous graph interaction and mask multi-head attention mechanism | |
CN114239575B (en) | Statement analysis model construction method, statement analysis method, device, medium and computing equipment | |
CN113221575B (en) | PU reinforcement learning remote supervision named entity identification method | |
CN114462386A (en) | End-to-end chapter event extraction method and system based on deep learning | |
CN115545038A (en) | Aspect emotion analysis method for optimizing grid label | |
CN115062109A (en) | Entity-to-attention mechanism-based entity relationship joint extraction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||