CN114757189B - Event extraction method and device, intelligent terminal and storage medium - Google Patents

Info

Publication number: CN114757189B
Application number: CN202210661693.3A
Authority: CN (China)
Prior art keywords: event, vector, event type, argument, extracted
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN114757189A
Inventors: 杨海钦, 叶俊鹏
Current Assignee: International Digital Economy Academy IDEA (the listed assignees may be inaccurate)
Original Assignee: International Digital Economy Academy IDEA
Application filed by International Digital Economy Academy IDEA
Priority to CN202210661693.3A
Publication of CN114757189A
Application granted
Publication of CN114757189B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an event extraction method and device, an intelligent terminal, and a storage medium. The method comprises the following steps: obtaining a sentence to be extracted, and performing word encoding and position encoding on each word to obtain the corresponding word embedding vector and position embedding vector; adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into an encoder, and outputting a contextualized expression vector through the encoder; inputting the contextualized expression vector into a multi-label event type classifier to determine the event type embedding vectors corresponding to the sentence to be extracted, and obtaining the corresponding event type comprehensive vector; adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, and inputting the second input vector into an event argument classifier to obtain the event arguments corresponding to the sentence to be extracted; and constructing argument combinations from the event arguments, performing event classification on each argument combination, and determining its target event type. The invention improves the efficiency of event extraction.

Description

Event extraction method and device, intelligent terminal and storage medium
Technical Field
The invention relates to the technical field of natural language processing, in particular to an event extraction method, an event extraction device, an intelligent terminal and a storage medium.
Background
With the development of science and technology, natural language processing is widely applied, and event extraction is used more and more extensively. An event is an occurrence, typically expressed at the sentence level, in which one or more roles participate in one or more actions within a particular time span and place. Event extraction structures such events; the goal of the structuring is to determine the event type to which an event belongs and to extract the event participants.
In the prior art, event extraction relies on preset trigger words. The problem with this approach is that it ignores the event type information of a sentence, and the preset trigger words do not necessarily correspond to the sentence completely, which limits the accuracy of event extraction.
Thus, there is still a need for improvement and development of the prior art.
Disclosure of Invention
The invention mainly aims to provide an event extraction method and device, an intelligent terminal, and a storage medium, in order to solve the problems that prior-art event extraction relies on preset trigger words, ignores the event type information of a sentence, and uses trigger words that do not necessarily correspond to the sentence completely, all of which limit the accuracy of event extraction.
In order to achieve the above object, a first aspect of the present invention provides an event extraction method, wherein the event extraction method includes:
acquiring sentences to be extracted, and performing word coding and position coding on each word in the sentences to be extracted to obtain word embedded vectors and position embedded vectors corresponding to the sentences to be extracted;
adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting a contextualized expression vector of the sentence to be extracted through the encoder;
inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier, and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector;
adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, inputting the second input vector into a pre-trained event argument classifier, and obtaining event arguments corresponding to the sentence to be extracted through the event argument classifier;
and constructing argument combinations according to the event arguments, performing event classification on each argument combination, and determining a target event type corresponding to each argument combination, wherein the target event type of an argument combination is either a non-event type or one of the event types corresponding to the sentence to be extracted.
Optionally, the sentence to be extracted corresponds to a plurality of event types, and the dimension of the event type comprehensive vector is the same as the dimension of the contextualized expression vector. Inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the sentence to be extracted through the multi-label event type classifier, and obtaining an event type comprehensive vector corresponding to the sentence to be extracted according to the event type embedding vector includes:
inputting the contextualized expression vector into a pre-trained multi-label event type classifier, and determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier;
and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedded vector.
Optionally, the obtaining an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedded vector includes:
acquiring event probability corresponding to each event type, wherein the event probability is determined when the event type embedding vector corresponding to the statement to be extracted is determined through the multi-label event type classifier;
taking an event type with an event probability larger than a preset probability threshold value as a to-be-processed event type, and taking an event type embedding vector corresponding to the to-be-processed event type as a to-be-processed embedding vector;
and performing weighted summation on the embedding vectors to be processed corresponding to the event types to be processed to obtain the event type comprehensive vector, wherein the weighting coefficients corresponding to the embedding vectors to be processed are equal, or the event probability corresponding to the event types to be processed is taken as the weighting coefficient of the embedding vector to be processed corresponding to the event types to be processed.
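As a hedged illustration of the weighted-summation step above, the following plain-Python sketch combines the embedding vectors of the event types whose predicted probability exceeds the threshold, using either equal weights or the event probabilities as weights. The function and variable names (and the example event types) are illustrative assumptions, not the patent's implementation.

```python
def comprehensive_vector(type_embeddings, type_probs, threshold=0.5,
                         use_prob_weights=True):
    """Weighted sum of the embedding vectors of the retained event types.

    type_embeddings: dict of event type -> embedding vector (list of floats)
    type_probs:      dict of event type -> probability from the classifier
    """
    # keep only event types whose probability exceeds the preset threshold
    kept = [t for t, p in type_probs.items() if p > threshold]
    dim = len(next(iter(type_embeddings.values())))
    total = [0.0] * dim
    for t in kept:
        # equal weights, or the event probability as the weighting coefficient
        w = type_probs[t] if use_prob_weights else 1.0 / len(kept)
        for i, x in enumerate(type_embeddings[t]):
            total[i] += w * x
    return total

embs = {"Attack": [1.0, 0.0], "Die": [0.0, 1.0], "Meet": [1.0, 1.0]}
probs = {"Attack": 0.9, "Die": 0.8, "Meet": 0.1}  # "Meet" is below the threshold
print(comprehensive_vector(embs, probs))
```

With probability weighting, the result is 0.9 times the "Attack" embedding plus 0.8 times the "Die" embedding; with equal weighting, each retained embedding contributes 1/2.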
Optionally, the obtaining an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedded vector includes:
acquiring a weight matrix, a projection matrix and an event type embedding matrix, wherein the event type embedding matrix is acquired according to the event type embedding vector, the weight matrix is a matrix with m rows and 1 columns, the projection matrix and the event type embedding matrix are both matrices with m rows and d columns, m is the number of event types corresponding to the statement to be extracted, and d is the dimension of the event type embedding vector;
calculating and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the weight matrix, the projection matrix, the event type embedding matrix and the contextualized expression vector;
wherein, each element in the weight matrix is an event probability corresponding to each event type, or each element in the weight matrix is 1.
Optionally, the i-th event type comprehensive vector is the product of the transpose of a target matrix and the event type embedding matrix, the target matrix is the Hadamard product of a target product matrix and the weight matrix, and the target product matrix is the product of the i-th contextualized expression vector and the projection matrix.
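Under the dimensions stated above (weight matrix m x 1, projection and embedding matrices both m x d), this formula can be sketched in plain Python as follows. This is one reading of the claim with illustrative names, not the patent's actual implementation.

```python
def comprehensive_vector_matrix(h, P, E, w):
    """Compute one event type comprehensive vector from:
    h: the i-th contextualized expression vector (length d)
    P: projection matrix (m x d), one row per event type
    E: event type embedding matrix (m x d)
    w: weight column (length m): per-type event probabilities, or all ones
    """
    # target product matrix: product of h and the projection matrix -> m values
    s = [sum(p * x for p, x in zip(row, h)) for row in P]
    # target matrix: Hadamard (element-wise) product with the weight column
    g = [wi * si for wi, si in zip(w, s)]
    # comprehensive vector: transpose(target matrix) times the embedding matrix
    d = len(h)
    return [sum(g[k] * E[k][j] for k in range(len(E))) for j in range(d)]

h = [1.0, 0.0]                      # d = 2
P = [[1.0, 0.0], [0.0, 1.0]]        # m = 2 event types
E = [[1.0, 2.0], [3.0, 4.0]]
w = [1.0, 1.0]                      # all-ones weight column
print(comprehensive_vector_matrix(h, P, E, w))  # -> [1.0, 2.0]
```

Here the projection scores h against each event type, the weights gate those scores, and the gated scores mix the rows of the event type embedding matrix into a single d-dimensional vector.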
Optionally, the constructing of argument combinations according to the event arguments, performing event classification on each argument combination, and determining a target event type corresponding to each argument combination includes:
acquiring word attributes corresponding to the event arguments, and combining the event arguments according to the word attributes corresponding to the event arguments to obtain a plurality of argument combinations, wherein each argument combination comprises a plurality of event arguments, and the word attributes corresponding to the event arguments in one argument combination are different;
and carrying out event classification on each argument combination and determining a target event type corresponding to each argument combination.
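One plausible reading of the combination step, sketched in plain Python: group the extracted event arguments by their word attribute, then draw one argument per attribute so that the word attributes within each combination differ. The names and the example arguments are illustrative assumptions, not the patent's implementation.

```python
from itertools import product

def argument_combinations(arguments):
    """arguments: list of (word, attribute) pairs from the argument classifier.
    Returns every combination holding one argument per distinct attribute."""
    by_attr = {}
    for word, attr in arguments:
        by_attr.setdefault(attr, []).append(word)
    attrs = sorted(by_attr)
    return [dict(zip(attrs, combo))
            for combo in product(*(by_attr[a] for a in attrs))]

args = [("Baghdad", "Place"), ("Mosul", "Place"), ("him", "Victim")]
print(argument_combinations(args))
# two combinations, one per "Place" candidate, each paired with the "Victim"
```

Each resulting combination can then be passed to the argument combination event type classifier described below, which assigns it either a non-event type or one of the sentence's event types.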
Optionally, the event classification for each argument combination and determining a target event type corresponding to each argument combination includes:
and inputting the argument combination into a pre-trained argument combination event type classifier, performing event classification on each event argument combination through the argument combination event type classifier, and acquiring a target event type corresponding to each argument combination.
A second aspect of the present invention provides an event extraction device, wherein the event extraction device includes:
the sentence processing module is used for acquiring sentences to be extracted, and performing word coding and position coding on each word in the sentences to be extracted to obtain word embedded vectors and position embedded vectors corresponding to the sentences to be extracted;
an embedded vector processing module, configured to add the word embedded vector and the position embedded vector to obtain a first input vector, input the first input vector into a pre-trained encoder, and output a contextualized expression vector of the to-be-extracted sentence through the encoder;
an event type determining module, configured to input the contextualized expression vector into the pre-trained multi-label event type classifier, determine, by the multi-label event type classifier, the event type embedding vector corresponding to the sentence to be extracted, and obtain, according to the event type embedding vector, the event type comprehensive vector corresponding to the sentence to be extracted;
an event argument extraction module, configured to add the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, input the second input vector into a pre-trained event argument classifier, and obtain an event argument corresponding to the to-be-extracted statement through the event argument classifier;
and the event argument processing module is used for constructing argument combinations according to the event arguments, performing event classification on each argument combination, and determining a target event type corresponding to each argument combination, wherein the target event type of an argument combination is either a non-event type or an event type corresponding to the sentence to be extracted.
A third aspect of the present invention provides an intelligent terminal, where the intelligent terminal includes a memory, a processor, and an event extraction program stored in the memory and executable on the processor, and the event extraction program, when executed by the processor, implements the steps of any one of the event extraction methods described above.
A fourth aspect of the present invention provides a computer-readable storage medium, in which an event extraction program is stored, and the event extraction program, when executed by a processor, implements the steps of any one of the event extraction methods described above.
As can be seen from the above, in the scheme of the present invention, a sentence to be extracted is obtained, and word encoding and position encoding are performed on each word in the sentence to be extracted, so as to obtain a word embedded vector and a position embedded vector corresponding to the sentence to be extracted; adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting a contextualized expression vector of the sentence to be extracted through the encoder; inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier, and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector; adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, inputting the second input vector into a pre-trained event argument classifier, and obtaining an event argument corresponding to the statement to be extracted through the event argument classifier; and constructing argument combinations according to the event arguments, performing event classification on the argument combinations and determining target event types corresponding to the argument combinations, wherein the target event type corresponding to one argument combination is any one of a non-event or an event type corresponding to the statement to be extracted. 
Compared with prior-art event extraction schemes that depend on preset trigger words, the present invention requires no trigger words: the event types of the sentence to be extracted are obtained and fused, as indicating information, with the sentence information, which improves the accuracy of event extraction. Meanwhile, all event types act together so that the event arguments are extracted in a single pass, and the obtained event arguments are then permuted and combined to realize event type classification; there is no need to extract event arguments multiple times, which improves the efficiency of event extraction, and all event types in the original sentence to be processed can be combined for better event extraction.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a schematic flow chart of an event extraction method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a specific flow chart of event extraction according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating an example of event extraction according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating an example of event extraction according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating an event extraction according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an event extraction device according to an embodiment of the present invention;
fig. 7 is a schematic block diagram of an internal structure of an intelligent terminal according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when …" or "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings of the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
With the development of science and technology, natural language processing is widely applied, and event extraction is used more and more extensively. An event is an occurrence, typically expressed at the sentence level, in which one or more roles participate in one or more actions within a certain time span and place. Event extraction structures such events; the goal of the structuring is to determine the event type to which an event belongs and to extract the event participants, such as related entities, times, and numerical values.
In the prior art, event extraction relies on preset trigger words. The problem with this approach is that it ignores the event type information of a sentence, and the preset trigger words do not necessarily correspond to the sentence completely, which limits the accuracy of event extraction.
Meanwhile, due to the complexity and diversity of language, a sentence may involve multiple events. Prior-art event extraction methods give little consideration to this multi-event situation, which easily leads to incomplete event extraction and again limits accuracy.
In one application scenario, the event parameters to be extracted are extracted in multiple steps according to the event types, which reduces the efficiency of event extraction. In another application scenario, there is no direct association between event type identification and event parameter extraction (i.e., event argument extraction), and event type information is not deeply fused with event semantic information during event parameter extraction, which affects the accuracy of event extraction.
In order to solve at least one of the problems, in the scheme of the invention, a sentence to be extracted is obtained, word coding and position coding are carried out on each word in the sentence to be extracted, and a word embedding vector and a position embedding vector corresponding to the sentence to be extracted are obtained; adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting a contextualized expression vector of the sentence to be extracted through the encoder; inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier, and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector; adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, inputting the second input vector into a pre-trained event argument classifier, and obtaining an event argument corresponding to the statement to be extracted through the event argument classifier; and constructing argument combinations according to the event arguments, performing event classification on the argument combinations and determining target event types corresponding to the argument combinations, wherein the target event type corresponding to one argument combination is any one of a non-event or an event type corresponding to the statement to be extracted.
Compared with the prior-art event extraction schemes that depend on preset trigger words, the present invention does not need trigger words: the event types of the sentence to be extracted are obtained and fused, as indicating information, with the sentence information, improving the accuracy of event extraction. Meanwhile, all event types act together to perform event argument extraction once, and the obtained event arguments are then permuted and combined to realize event type classification without performing event argument extraction multiple times, so the efficiency of event extraction is improved, and event extraction can be better performed by combining all event types in the original sentence to be processed.
Exemplary method
As shown in fig. 1, an embodiment of the present invention provides an event extraction method, specifically, the method includes the following steps:
step S100, obtaining a sentence to be extracted, and performing word coding and position coding on each word in the sentence to be extracted to obtain a word embedding vector and a position embedding vector corresponding to the sentence to be extracted.
Specifically, the sentence to be extracted is a sentence on which event extraction needs to be performed; it may correspond to one event type or to a plurality of event types. It should be noted that each event type corresponds to one event type embedding vector.
Specifically, the sentence to be extracted includes a plurality of words. Each word forms a word vector after word encoding; position encoding encodes the position information of the word in the sentence to be extracted and forms a position vector. It should be noted that the position vector represents the position information of each word, namely the order in which the words appear in the sentence to be extracted; in one application scenario, this is the order of appearance from left to right.
After the word vectors corresponding to the words are obtained, they are arranged according to the order of the words in the sentence to be extracted to obtain the word embedding vector corresponding to the sentence; the position vectors are likewise arranged according to the order of the words to obtain the position embedding vector corresponding to the sentence. A word may be a Chinese character, a word in a foreign language, or a character (e.g., an Arabic numeral); this is not specifically limited here.
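A minimal plain-Python sketch of this step, using a toy vocabulary: each word is looked up in a word embedding table, and each position (left to right) in a position embedding table. Both tables and all names are illustrative assumptions, not the patent's implementation.

```python
def encode_sentence(words, word_table, position_table):
    """Return the ordered word vectors and position vectors for a sentence."""
    word_vectors = [word_table[w] for w in words]                       # word encoding
    position_vectors = [position_table[i] for i in range(len(words))]  # position encoding
    return word_vectors, position_vectors

# toy 2-dimensional embedding tables for illustration
word_table = {"[CLS]": [0.1, 0.2], "kill": [0.3, 0.4]}
position_table = {0: [0.0, 0.0], 1: [0.5, 0.5]}
wv, pv = encode_sentence(["[CLS]", "kill"], word_table, position_table)
print(wv, pv)
```

In a real system both tables would be learned embedding layers; here the lookup simply preserves the left-to-right word order, which is what the position embedding vector records.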
Step S200, adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting the contextualized expression vector of the to-be-extracted sentence through the encoder.
The word embedding vector and the position embedding vector have the same number of elements and the same vector dimension. Specifically, in the present embodiment, the word embedding vector is composed of a plurality of word vectors sorted, the position embedding vector is composed of a plurality of position vectors sorted, and the number of word vectors in the word embedding vector is the same as the number of position vectors in the position embedding vector.
In step S200, the adding of the word embedding vector and the position embedding vector means adding each word vector in the word embedding vector and a position vector corresponding to each word vector, wherein a vector dimension of each word vector is equal to a vector dimension of the corresponding position vector. Therefore, the vector dimension of the word embedding vector and the vector dimension of the position embedding vector are equal to the vector dimension corresponding to the input sentence to be extracted. In this embodiment, the word embedding vector and the position embedding vector are added in a point-by-point addition manner.
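The point-by-point addition described above can be sketched as follows; the names are illustrative, and the function assumes equal lengths and dimensions, as the text requires.

```python
def first_input_vector(word_vectors, position_vectors):
    """Element-wise (point-by-point) sum of each word vector and the
    position vector at the same index; dimensions must match."""
    assert len(word_vectors) == len(position_vectors)
    return [[w + p for w, p in zip(wv, pv)]
            for wv, pv in zip(word_vectors, position_vectors)]

print(first_input_vector([[1.0, 2.0]], [[0.5, 0.5]]))  # -> [[1.5, 2.5]]
```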
Further, in this embodiment, each word vector and each position vector have the same vector dimension, which facilitates the calculation and improves the efficiency of event extraction.
In this embodiment, the pre-trained encoder is a pre-trained Transformer language model encoder (i.e., a Transformer encoder), which is not specifically limited here. The input of the encoder (i.e., the first input vector) consists of two parts: the first part is the word vectors (word embeddings) of the sentence text after passing through a word embedding layer, and the second part is the position vectors (position embeddings) of the position information after passing through a position embedding layer. After the first input vector is encoded by the self-attention mechanism of the Transformer language model, the contextualized expression (contextualized representation) vector corresponding to the sentence to be extracted is output. The contextualized expression vector is the result of the model mapping the input data into the same dimensional space through the attention mechanism.
It should be noted that the Transformer language model is a machine-learning model based on the attention mechanism. It can process all words or symbols in the text in parallel while using the attention mechanism to incorporate context from distant words; by processing all words in parallel, it lets each word attend to the other words in the sentence over multiple processing steps. The input item of the Transformer encoder is the first input vector, and the output item is the contextualized expression vector of the sentence to be extracted, which comprises the contextualized expression vectors of all words in the sentence to be extracted.
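For intuition only, here is a minimal single-head scaled dot-product self-attention pass in plain Python, with identity query/key/value projections. A real Transformer encoder learns these projections and stacks many such layers, so this is a sketch of the mechanism, not the model used in the patent.

```python
import math

def self_attention(X):
    """X: list of n row vectors (length d). Each output row is an
    attention-weighted mixture of all input rows."""
    n, d = len(X), len(X[0])
    # scaled dot-product scores between every pair of positions
    scores = [[sum(q * k for q, k in zip(X[i], X[j])) / math.sqrt(d)
               for j in range(n)] for i in range(n)]
    out = []
    for row in scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]   # numerically stable softmax
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(a * X[j][k] for j, a in enumerate(weights))
                    for k in range(d)])
    return out

H = self_attention([[1.0, 0.0], [0.0, 1.0]])
print(H)
```

Each output vector mixes every input vector, which is how distant words contribute context to each position's contextualized expression vector.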
FIG. 2 is a specific flow diagram of event extraction provided in an embodiment of the present invention. As shown in fig. 2, the sentence to be extracted is "took him to Baghdad and killed him", which is encoded to obtain a word embedding vector and a position embedding vector. It should be noted that [CLS] represents the start delimiter of a sentence; the start delimiter can also be treated as a word of the sentence to be extracted for position coding and word coding, yielding a position vector and a word vector corresponding to the start delimiter. As shown in fig. 2, in this embodiment, word coding is performed on the sentence to be extracted to obtain a plurality of word vectors, which are arranged in order to obtain the corresponding word embedding vector (w_[CLS], w_1, …, w_7); position coding is performed on the sentence to be extracted to obtain position vectors, which are sorted to obtain the corresponding position embedding vector. It should be noted that, in this embodiment, the sentence to be extracted includes 7 words and 1 start delimiter, so the obtained word embedding vector includes 8 word vectors; in a specific use process, the number of word vectors in the word embedding vector is determined according to actual requirements and is not specifically limited herein. The word embedding vector and the position embedding vector are added point by point to obtain a first input vector, and the first input vector is input into the Transformer encoder to obtain the corresponding contextualized expression vector (h_[CLS], h_1, …, h_7), where h_[CLS] is the hidden vector of the flag bit [CLS] peculiar to the pre-trained language model (i.e., the Transformer language model), which may also be called the word hidden vector of the flag bit [CLS]; h_1 represents the contextualized expression vector (i.e., word hidden vector) of the 1st word in the sentence to be extracted; h_2 represents the contextualized expression vector of the 2nd word in the sentence to be extracted, and so on, which is not described in detail.
Step S300, inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier, and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector.
In this embodiment, the multi-label event type classifier may be a fully connected layer. For the text classification task, the contextualized expression vector of a sentence is input into a multi-label event type classifier built from a fully connected layer to perform multi-label classification, and the classifier can output a plurality of event types and their corresponding probabilities. It should be noted that multi-label classification means that each input may have more than one label.
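One common way to realize such a multi-label classifier is a fully connected layer followed by an independent sigmoid per event type; the sketch below assumes that design (the dimensions, weights, and variable names are illustrative, not taken from the patent):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
d, num_event_types = 8, 5

W = rng.normal(size=(num_event_types, d))  # fully connected layer weights
b = np.zeros(num_event_types)              # fully connected layer bias

h_cls = rng.normal(size=d)                 # contextualized expression vector of [CLS]
probs = sigmoid(W @ h_cls + b)             # one independent probability per event type

# multi-label: every event type whose probability clears a threshold is predicted,
# so a sentence can carry several event types at once
predicted = np.flatnonzero(probs > 0.5)
```

In contrast to a softmax classifier, the per-type sigmoids do not compete, which is what allows more than one label per input.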
In this embodiment, a statement to be extracted is subjected to multi-event extraction as an example, where the statement to be extracted corresponds to multiple event types, one event type comprehensive vector corresponds to one contextualized expression vector, and the dimensionality of the event type comprehensive vector is the same as that of the contextualized expression vector, so that each contextualized expression vector is respectively summed with one corresponding event type comprehensive vector.
Specifically, the step S300 includes: inputting the contextualized expression vector into a pre-trained multi-label event type classifier, and determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier; and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedded vector.
In an application scenario, the multi-tag event type classifier directly outputs the corresponding event type (and the event probability corresponding to each event type), and then obtains the corresponding event type embedding vector according to the event type. Specifically, the contextualized expression vector is input into a pre-trained multi-label event type classifier, and an event type corresponding to the statement to be extracted is determined through the multi-label event type classifier; acquiring event type embedded vectors corresponding to the statements to be extracted according to the event types, wherein the event types correspond to the event type embedded vectors one to one; and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedded vector.
In this embodiment, event extraction is divided into two stages, and multiple event extraction is performed. As shown in fig. 2, the first stage in this embodiment corresponds to the left side of fig. 2: event type classification is performed by the encoder and the multi-label event type classifier, finally obtaining the event type comprehensive vector. The second stage corresponds to the right side of fig. 2: single-step event extraction is performed according to the event type comprehensive vector and the contextualized expression vector obtained in the first stage, the event arguments in the sentence to be extracted are obtained, and finally the target event type corresponding to each argument combination is determined.
The event type embedding vector is a vector representation corresponding to each event type. FIG. 3 is a specific flow diagram of event extraction provided in an embodiment of the present invention. As shown in fig. 3, the multi-label event type classifier is used to process the contextualized expression vector and obtain a plurality of event type embedding vectors corresponding to the sentence to be extracted, such as the event type embedding vector t_a corresponding to event type a, the event type embedding vector t_b corresponding to event type b, the event type embedding vector t_c corresponding to event type c, and the like. Each event type corresponds to an event probability, that is, the probability that the sentence to be extracted corresponds to that event type. For example, in fig. 3 the event probability p_b corresponding to event type b is 0.8, and the event probability p_c corresponding to event type c is 0.6. It should be noted that the event type embedding vector (event type embedding) is a trainable vector and may be used as an event indication vector; alternatively, a corresponding event type comprehensive vector is obtained from the event type embedding vectors and used as the event indication vector, which is fused with the word vectors to assist argument extraction in the second stage. Therefore, the two stages are associated and the scheme is extensible: the overall architecture of the model is not affected as event types are added.
FIG. 4 is a specific flowchart of event extraction according to an embodiment of the present invention. In one application scenario, as shown in fig. 3 and fig. 4, each contextualized expression vector corresponds to the same event type comprehensive vector c, and each contextualized expression vector is added to the event type comprehensive vector c.
In one application scenario, a probability threshold is preset. For example, it may be set to 0.5, but is not specifically limited. For each event type, if the corresponding probability is greater than 0.5, that event type is considered to exist in the sentence to be extracted. For example, in fig. 3 and 4, the event probabilities of event type b and event type c are greater than 0.5, so event type b and event type c are used as the event types to be processed, and the corresponding event type comprehensive vector is obtained from these event types to be processed, which reduces the amount of calculation and improves the efficiency of event extraction.
Specifically, the obtaining of the integrated vector of the event type corresponding to the statement to be extracted according to the embedded vector of the event type includes:
acquiring event probabilities corresponding to the event types, wherein the event probabilities are determined when the event types corresponding to the statements to be extracted are determined to be embedded into vectors through the multi-label event type classifier;
taking an event type with an event probability larger than a preset probability threshold value as a to-be-processed event type, and taking an event type embedding vector corresponding to the to-be-processed event type as a to-be-processed embedding vector;
and performing weighted summation on the to-be-processed embedded vectors corresponding to the to-be-processed event types to obtain the event type comprehensive vector, wherein the weighting coefficients corresponding to the to-be-processed embedded vectors are equal, or the event probability corresponding to the to-be-processed event types is used as the weighting coefficient of the to-be-processed embedded vector corresponding to the to-be-processed event types.
In an application scenario, as shown in fig. 3, the event probability corresponding to each event type to be processed is used as the weight coefficient of the corresponding embedding vector to be processed, and the event type comprehensive vector is obtained by calculation. Specifically, the corresponding event type comprehensive vector may be calculated according to the following formula (1):

c = p_b · t_b + p_c · t_c    (1)

where t_b and t_c are the event type embedding vectors corresponding to event types b and c (i.e., the embedding vectors to be processed), and p_b and p_c are the event probabilities corresponding to event types b and c, respectively. This embodiment is described taking the case of only two event types to be processed as an example, without specific limitation.
In another application scenario, the same weight coefficient may be set for each embedding vector to be processed. It should be noted that the weight coefficient of each embedding vector to be processed may be preset to the same number (for example, all set to 1), or may be adjusted according to the number of embedding vectors to be processed (while keeping them equal to each other). As shown in fig. 4, the weight coefficient may be set to the reciprocal of the number of embedding vectors to be processed; since the sentence to be extracted corresponds to 2 event types to be processed (i.e., 2 embedding vectors to be processed), the weight coefficient may be set to one half, and the corresponding event type comprehensive vector is calculated according to the following formula (2):

c = (t_b + t_c) / 2    (2)

where t_b and t_c are the event type embedding vectors corresponding to event types b and c, respectively, so that the average of the embedding vectors to be processed is used as the event type comprehensive vector, improving the accuracy and efficiency of event extraction.
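The two weighting schemes above can be sketched as follows (the embedding values, probabilities, and function name are illustrative assumptions, not values from the patent):

```python
import numpy as np

def event_type_synthesis(embeddings, probs, threshold=0.5, use_probs=True):
    """Combine the embedding vectors of event types whose probability exceeds
    `threshold` into one event type comprehensive vector."""
    keep = probs > threshold
    kept_emb, kept_p = embeddings[keep], probs[keep]
    if use_probs:                        # probability-weighted sum
        return kept_p @ kept_emb
    return kept_emb.mean(axis=0)         # equal weights: mean of kept embeddings

t = np.array([[1.0, 0.0],                # toy embedding t_a
              [0.0, 1.0],                # toy embedding t_b
              [2.0, 2.0]])               # toy embedding t_c
p = np.array([0.2, 0.8, 0.6])            # event probabilities; only b and c pass 0.5

c_weighted = event_type_synthesis(t, p)                  # 0.8*t_b + 0.6*t_c
c_mean = event_type_synthesis(t, p, use_probs=False)     # (t_b + t_c) / 2
```

With these toy numbers the weighted variant yields [1.2, 2.0] and the averaged variant yields [1.0, 1.5]; either result then serves as the event type comprehensive vector.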
Fig. 5 is a schematic specific flow chart of event extraction provided in this embodiment. In an application scenario, as shown in fig. 5, the number of event type comprehensive vectors is the same as the number of contextualized expression vectors corresponding to the sentence to be extracted; the value of the event type comprehensive vector corresponding to each contextualized expression vector is determined according to that contextualized expression vector, and the values of the event type comprehensive vectors corresponding to different contextualized expression vectors may differ. It should be noted that, in fig. 5, c_1 represents the 1st event type comprehensive vector, which corresponds to the 1st contextualized expression vector h_1, and c_i represents the i-th event type comprehensive vector, which corresponds to the i-th contextualized expression vector h_i. Therefore, all event types corresponding to the sentence to be extracted are comprehensively considered, improving the accuracy of event extraction.
Specifically, learning to obtain a corresponding event type comprehensive vector for each character in a sentence to be extracted, and obtaining the event type comprehensive vector corresponding to the sentence to be extracted according to the event type embedded vector includes: acquiring a weight matrix, a projection matrix and an event type embedding matrix, wherein the event type embedding matrix is acquired according to the event type embedding vector, the weight matrix is a matrix with m rows and 1 columns, the projection matrix and the event type embedding matrix are both matrices with m rows and d columns, m is the number of event types corresponding to the statement to be extracted, and d is the dimension of the event type embedding vector; calculating and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the weight matrix, the projection matrix, the event type embedding matrix and the contextualized expression vector; wherein, each element in the weight matrix is an event probability corresponding to each event type, or each element in the weight matrix is 1.
Specifically, the i-th event type comprehensive vector is the product of the transpose of a target matrix and the event type embedding matrix; the target matrix is obtained as the Hadamard product of a target product matrix and the weight matrix; and the target product matrix is the product of the projection matrix and the i-th contextualized expression vector.
As shown in FIG. 5, in one application scenario, the event type comprehensive vector c_i corresponding to the i-th character may be calculated according to the following formula (3):

c_i = ((P h_i) ⊙ w)^T E    (3)

where P represents the projection matrix of shape (m, d); h_i is the vectorized expression (i.e., contextualized expression vector) corresponding to the i-th character in the contextualized expression vector output by the Transformer language model; w represents the weight matrix; ^T represents matrix transposition; E represents the event type embedding matrix; and ⊙ represents the Hadamard product. As shown in fig. 5, i can be a positive integer or [CLS]; the maximum value of i is determined according to the number of characters of the sentence to be extracted and is not specifically limited herein. It should be noted that, for a word in natural language, its d-dimensional vector expression is h_i ∈ R^d, and the specific values of its elements are determined according to actual requirements and are not specifically limited herein.
In particular, the projection matrix P is obtained by random initialization, where each row represents the projection parameters of one event type; the event type embedding matrix E is formed by the event type embedding vectors corresponding to the sentence to be extracted; and the weight matrix w is formed by the event probabilities corresponding to each event type, or each element in the weight matrix w is 1.
In an application scenario, the sentence to be extracted corresponds to 5 event types, the vector dimension corresponding to each event type (i.e., each event type embedding vector) is 512, and event extraction needs to be performed on a sentence with a length of 20 characters. Then P is a projection matrix of shape (5, 512); E is an event type embedding matrix of shape (5, 512) formed by the event type embedding vectors corresponding to the 5 event types; the contextualized expression vector (i.e., word hidden vector) h_1 obtained by Transformer encoding of the 1st character is a matrix of shape (512, 1); and the corresponding event type comprehensive vector c_1 obtained according to formula (3) above has shape (1, 512).
As shown in formula (3), in this embodiment, the projection matrix is first multiplied by the i-th contextualized expression vector h_i to obtain a target product matrix; the target product matrix is then Hadamard-multiplied by the weight matrix (which may be the probability distribution over all event types) to obtain a target matrix; and the product of the transpose of the target matrix and the event type embedding matrix is taken as the event type comprehensive vector c_i corresponding to that contextualized expression vector. Thus, the weight parameters (i.e., the elements in the weight matrix) allow the model to learn a better weighting for each word and different contexts, optimizing the effect of event type embedding and improving the accuracy of event extraction.
It should be noted that the Hadamard product is a type of matrix operation in which the elements at corresponding positions in two matrices are multiplied one by one. In another application scenario, each element in the weight matrix is 1, so Hadamard multiplication with the weight matrix is equivalent to multiplying each element of the original matrix by 1, and the weight matrix can be omitted. Therefore, when each element in the weight matrix is 1, the event type comprehensive vector c_i corresponding to the i-th character can be calculated by the following formula (4):

c_i = (P h_i)^T E    (4)

Thus, the amount of calculation can be reduced, improving the efficiency of event extraction.
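The matrix shapes involved in this per-character computation can be checked with a short sketch (random values and the (m, d) = (5, 512) sizes of the worked example above are used purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
m, d = 5, 512                        # 5 event types, 512-dimensional embeddings

P = rng.normal(size=(m, d))          # projection matrix, randomly initialized
E = rng.normal(size=(m, d))          # event type embedding matrix
w = np.array([0.1, 0.8, 0.6, 0.2, 0.3]).reshape(m, 1)  # weight matrix (event probabilities)

h_i = rng.normal(size=(d, 1))        # contextualized expression vector of the i-th character

# formula (3): project, weight by Hadamard product, then map back through E
c_i = ((P @ h_i) * w).T @ E          # shape (1, d)

# formula (4): all-ones weight matrix omitted
c_i_unweighted = (P @ h_i).T @ E     # shape (1, d)
```

Following the shapes: P @ h_i is (m, 1), the Hadamard product with w keeps (m, 1), the transpose gives (1, m), and the final product with E yields the (1, 512) event type comprehensive vector, matching the worked example.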
It should be noted that, in the process of calculating the event type comprehensive vector, the weighting weight of each event type embedding vector may be taken from the attention weight, so as to improve the accuracy of event extraction.
And step S400, adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, inputting the second input vector into a pre-trained event argument classifier, and obtaining the event argument corresponding to the statement to be extracted through the event argument classifier.
In this embodiment, for the multiple types of events detected in the first stage, all the corresponding event type embedding vectors are weighted to obtain the corresponding event type comprehensive vectors; the event type comprehensive vectors are then added to the contextualized expression vectors obtained in the first stage to serve as the second input vector for argument extraction, and argument extraction is performed. The extracted arguments are combined and event-paired to obtain the event arguments corresponding to each detected event type. Moreover, for the N detected event types, all event arguments can be extracted in one step without performing a separate extraction for each event type, improving the efficiency of event extraction.
Specifically, in this embodiment, the second input vector is input into a pre-trained event argument classifier, and an event argument (i.e., an event parameter) corresponding to the to-be-extracted statement is obtained.
And S500, constructing argument combinations according to the event arguments, classifying the events of the argument combinations and determining the target event types corresponding to the argument combinations.
And one argument combination comprises at least one event argument, and the target event type corresponding to one argument combination is any one of a non-event or an event type corresponding to the statement to be extracted.
In this embodiment, the step S500 specifically includes: acquiring word attributes corresponding to the event arguments, and combining the event arguments according to the word attributes corresponding to the event arguments to obtain a plurality of argument combinations, wherein each argument combination comprises a plurality of event arguments, and the word attributes corresponding to the event arguments in one argument combination are different; and carrying out event classification on each argument combination and determining a target event type corresponding to each argument combination.
Further, the event classification for each argument combination and the determination of the target event type corresponding to each argument combination includes: and inputting the argument combination into a pre-trained argument combination event type classifier, performing event classification on each event argument combination through the argument combination event type classifier, and acquiring a target event type corresponding to each argument combination.
The event argument (i.e., event parameter) is a character or a character combination having a certain meaning of the event argument. One event argument corresponds to one word attribute, such as subject, predicate, object, complement, etc., and event arguments of different word attributes are combined. In the present embodiment, as shown in fig. 2 to fig. 5, the extracted subjects, predicates, and objects are taken as an example for explanation, where S1, S2, and S3 represent the extracted 3 subjects, P1, P2, and P3 represent the extracted 3 predicates, and O1, O2, and O3 represent the extracted 3 objects. The event-arguments are combined according to word attributes, for example, one argument combination includes at least a subject, a predicate, and an object. The argument combination event type classifier is a classifier trained in advance for classifying events of argument combinations. It should be noted that the multi-label event type classifier is a classifier for determining a target event type that may exist in a statement, and the function of the multi-label event type classifier is different from that of the argument combination event type classifier.
In an application scenario, one argument combination comprises a plurality of event arguments, wherein the word attributes corresponding to each event argument are different, and the number of the event arguments in one argument combination is the same as the number of the kinds of the word attributes. And after the corresponding argument combinations are obtained, event classification is carried out on each argument combination through a pre-trained argument combination event type classifier, and a target event type corresponding to each argument combination is determined. In this embodiment, the argument combination event type classifier used is an SPO event type classifier trained in advance for classifying argument combinations composed of principal and predicate guests, but is not limited specifically. Therefore, the event arguments obtained by extraction are combined and classified, and each target event type and the corresponding argument combination can be better determined.
It should be noted that there are various ways of combining and classifying event arguments, which are not specifically limited herein. In an application scenario, a multi-classifier and beam search may be used for the event classification of argument combinations; alternatively, argument combinations (or event arguments) may be scored by a ranking algorithm, with the event types ranked by score to determine the corresponding target event type.
Specifically, all extracted event arguments are exhaustively permuted and combined to obtain the corresponding argument combinations. For each argument combination, the first-character hidden vectors h of its arguments are superposed or averaged and input into a classifier, which classifies the argument combination. If the argument combination is not a reasonable combination, the classifier classifies it as a non-event and it is discarded.
For example, two subjects S1 and S2 and two predicates P1 and P2 are detected in a text; in this case there are four possible combinations of event elements: S1P1, S1P2, S2P1 and S2P2. Taking S1P1 as an example, the first-character hidden vector of S1 is h_S1 and the first-character hidden vector of P1 is h_P1; then h_S1 + h_P1 is input into the classifier as a feature. If the classifier judges that it is not an event, the S1P1 combination is discarded; otherwise, the combination is kept under the corresponding event type, which is taken as the target event type.
In one application scenario, for each argument combination, the first-character hidden vectors h of its arguments are superposed or averaged, the vector distance to the event type embedding vector of each event type is compared pairwise, the combination is assigned to the event type with the closest distance, and that event type is taken as the target event type.
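A minimal sketch of this combine-then-assign step, under the assumption of the nearest-embedding variant just described (all vectors, names, and the dedicated "non-event" row are toy illustrations, not the patent's trained values):

```python
import numpy as np
from itertools import product

# toy first-character hidden vectors of the extracted arguments
subjects = {"S1": np.array([1.0, 0.0]), "S2": np.array([0.0, 1.0])}
predicates = {"P1": np.array([1.0, 1.0]), "P2": np.array([2.0, 0.0])}

# toy event type embedding vectors; row 0 stands for "non-event"
type_emb = np.array([[10.0, 10.0],   # non-event (placed far away in this toy setup)
                     [1.0, 0.5],     # event type b
                     [0.5, 1.0]])    # event type c

def classify(combo_vecs):
    """Average the argument vectors and assign the nearest event type embedding."""
    v = np.mean(combo_vecs, axis=0)
    dists = np.linalg.norm(type_emb - v, axis=1)
    return int(np.argmin(dists))

# exhaustively pair every subject with every predicate, dropping non-events
results = {}
for (s, sv), (p, pv) in product(subjects.items(), predicates.items()):
    label = classify([sv, pv])
    if label != 0:
        results[s + p] = label
```

With these toy vectors, S1P1 averages to exactly the event type b embedding and is assigned label 1, while S2P1 lands on event type c; no combination falls nearest to the non-event row.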
Therefore, in this embodiment, when multi-event extraction is performed, the trigger words required by traditional event extraction do not need to be annotated, and a plurality of event types and their corresponding parameters can be extracted from one sentence. Meanwhile, event type classification and event argument extraction are combined, and a trainable event type comprehensive vector is set as indication information to assist parameter extraction in the second stage. In the second stage, the event type comprehensive vector and the contextualized expression vector are added to realize the embedding of the event type, so that the two stages are associated; at the same time the scheme is extensible, and the overall architecture of the model is not affected as event types are added. In the single-step extraction of the second stage, for the N detected events, all event arguments can be extracted in a single step, which is more efficient. Moreover, in the second stage the weighting weights are taken from the event probabilities corresponding to the event types in the first stage, dynamically combining the information of the two stages and improving the event extraction effect.
In an application scenario, the encoder, the multi-label event type classifier, the event argument classifier and the argument combination event type classifier can be respectively trained in advance through respective training data, the training data can include input data and artificial labeling data, output data is obtained according to the input data in the training process, a loss value is calculated according to the output data and the artificial labeling data, and parameters of the encoder, the event argument classifier and the argument combination event type classifier are adjusted according to the loss value until a training condition is met (a preset iteration number is reached or the loss value is smaller than a corresponding threshold value). For example, when the argument combination event type classifier is trained, the training data comprises a plurality of argument combinations and artificially labeled real target event types corresponding to the argument combinations. And inputting the argument combinations in the training data into an argument combination event type classifier, obtaining output target event types corresponding to the argument combinations, calculating loss values according to the output target event types and the real target event types, and adjusting parameters of the argument combination event type classifier according to the loss values until preset training conditions are met.
In this embodiment, the encoder, the multi-label event type classifier, the event argument classifier, and the argument combined event type classifier belong to the same event extraction model, that is, the event extraction model includes four parts, namely, the encoder, the multi-label event type classifier, the event argument classifier, and the argument combined event type classifier, and the whole event extraction model is directly trained end to end according to training data including artificial labeling data, which is beneficial to reducing training amount and improving training accuracy.
Specifically, in this embodiment, the event extraction model is trained according to the following steps:
inputting training sentences in training data into an event extraction model, and outputting training target event types corresponding to the training sentences through the event extraction model, wherein the training data comprises a plurality of groups of training sentence information, and each group of training sentence information comprises a training sentence and a real target event type corresponding to the training sentence;
and adjusting model parameters of the event extraction model according to the training target event type and the real target event type corresponding to the training sentences, and continuing to execute the step of inputting the training sentences in the training data into the event extraction model until preset training conditions are met to obtain the trained event extraction model.
In the training process of the event extraction model, the specific data processing process of each component is similar to the specific processing process of the event extraction model, and is not described herein again.
In an application scenario, the event extraction model may further calculate loss values in three parts and train the loss values, that is, calculate corresponding multi-label event type classification loss, event argument classification loss and argument combination event type classification loss for a multi-label event type classifier, an event argument classifier and an argument combination event type classifier, sum the three losses to obtain a final loss, calculate a gradient using the final loss, and perform gradient propagation on the three parts to train the model. It should be noted that, at this time, the training data corresponding to the event extraction model includes multiple sets of training statement information, and each set of training statement information includes a training statement, a true event type (and a true event probability corresponding to each true event type, or a true event type embedded vector), a true event argument, and a true target event type, so as to calculate three corresponding loss values respectively.
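The three-part loss described above can be sketched as follows; binary cross-entropy is assumed here as the per-classifier loss purely for illustration (the patent does not fix the loss function), and the predictions and labels are toy values:

```python
import numpy as np

def bce(p, y):
    """Binary cross-entropy over a vector of predicted probabilities."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

# toy predictions / labels for the three classifier heads
type_loss = bce(np.array([0.8, 0.6, 0.1]), np.array([1.0, 1.0, 0.0]))      # multi-label event type
argument_loss = bce(np.array([0.9, 0.2]), np.array([1.0, 0.0]))            # event argument
combo_loss = bce(np.array([0.7]), np.array([1.0]))                         # argument combination type

# the final loss is the sum of the three partial losses; in training, its
# gradient would be propagated through all three classifier heads jointly
final_loss = type_loss + argument_loss + combo_loss
```

In an end-to-end framework, `final_loss` would be the quantity differentiated for the gradient update; only the loss accumulation is shown here.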
As can be seen from the above, the event extraction method provided by the embodiment of the present invention obtains the event types of the sentence to be extracted without relying on trigger words, and fuses the event types, as indication information, with the sentence information, which is beneficial to improving the accuracy of event extraction. Meanwhile, all event types act jointly in a single pass of event argument extraction, and the obtained event arguments are then arranged and combined to realize event type classification, so that multiple rounds of event argument extraction are unnecessary; this improves the efficiency of event extraction and allows event extraction to be performed by combining all event types present in the original sentence to be processed.
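The "arrange and combine" step can be illustrated as taking one candidate argument per word attribute (role) and forming the Cartesian product across roles; the role names and helper function below are hypothetical, introduced only for illustration.

```python
from itertools import product

def build_argument_combinations(args_by_role):
    """Build candidate argument combinations: one event argument per
    word-attribute (role) slot, taking the Cartesian product across roles."""
    roles = sorted(args_by_role)
    return [dict(zip(roles, combo))
            for combo in product(*(args_by_role[r] for r in roles))]

# toy example with hypothetical roles: two subjects and one object
# yield two candidate argument combinations
combos = build_argument_combinations({
    "subject": ["company A", "company B"],
    "object": ["company C"],
})
# each combination would then be classified as a non-event or an event type
```

Classifying every combination (rather than re-running argument extraction per event type) is what realizes the one-pass design described above.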
Exemplary device
As shown in fig. 6, corresponding to the event extraction method, an embodiment of the present invention further provides an event extraction device, where the event extraction device includes:
the sentence processing module 610 is configured to obtain a sentence to be extracted, perform word coding and position coding on each word in the sentence to be extracted, and obtain a word embedding vector and a position embedding vector corresponding to the sentence to be extracted.
An embedded vector processing module 620, configured to add the word embedded vector and the position embedded vector to obtain a first input vector, input the first input vector into a pre-trained encoder, and output a contextualized expression vector of the to-be-extracted sentence through the encoder.
An event type determining module 630, configured to input the contextualized expression vector into a pre-trained multi-label event type classifier, determine, by the multi-label event type classifier, an event type embedding vector corresponding to the to-be-extracted sentence, and obtain, according to the event type embedding vector, an event type comprehensive vector corresponding to the to-be-extracted sentence.
And an event argument extracting module 640, configured to add the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, input the second input vector into a pre-trained event argument classifier, and obtain an event argument corresponding to the to-be-extracted statement through the event argument classifier.
An event argument processing module 650, configured to construct argument combinations according to the event arguments, perform event classification on the argument combinations, and determine a target event type corresponding to each argument combination, where the target event type corresponding to one argument combination is either a non-event or one of the event types corresponding to the to-be-extracted sentence.
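The data flow through the modules above can be sketched with toy stand-ins. All dimensions, the random "encoder", and the 0.5 probability threshold are assumptions for illustration; the patent's pre-trained encoder and classifiers are replaced here by random projections.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d, num_types = 6, 8, 4                 # toy sizes, not from the patent

# sentence processing module: word and position embeddings are added
word_emb = rng.normal(size=(seq_len, d))
pos_emb = rng.normal(size=(seq_len, d))
first_input = word_emb + pos_emb                # first input vector

# embedded vector processing module: random projection stands in for the encoder
W_enc = rng.normal(size=(d, d))
contextualized = np.tanh(first_input @ W_enc)   # contextualized expression vectors

# event type determining module: multi-label head over a pooled representation
W_cls = rng.normal(size=(d, num_types))
type_probs = 1.0 / (1.0 + np.exp(-(contextualized.mean(axis=0) @ W_cls)))
type_table = rng.normal(size=(num_types, d))    # event type embedding vectors
active = type_probs > 0.5                       # assumed probability threshold
if active.any():
    # comprehensive vector: probability-weighted sum of active type embeddings
    comprehensive = (type_probs[active, None] * type_table[active]).sum(axis=0)
else:
    comprehensive = np.zeros(d)

# event argument extraction module: second input vector is the element-wise sum
second_input = contextualized + comprehensive   # broadcasts over positions
```

The second input vector then feeds the event argument classifier, whose outputs the event argument processing module combines and classifies as described above.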
Specifically, in this embodiment, the specific functions of the event extraction device and each module thereof may refer to the corresponding descriptions in the event extraction method, and are not described herein again.
It should be noted that the event extraction device is not limited to the module division described above; in practical applications, the functions may be allocated to different modules as required.
Based on the above embodiment, the present invention further provides an intelligent terminal, and a schematic block diagram thereof may be as shown in fig. 7. The intelligent terminal comprises a processor, a memory, a network interface and a display screen which are connected through a system bus. The processor of the intelligent terminal provides computation and control capabilities. The memory of the intelligent terminal comprises a nonvolatile storage medium and an internal memory. The nonvolatile storage medium stores an operating system and an event extraction program. The internal memory provides an environment for running the operating system and the event extraction program in the nonvolatile storage medium. The network interface of the intelligent terminal is used for connecting and communicating with an external terminal through a network. The event extraction program, when executed by the processor, implements the steps of any one of the above event extraction methods. The display screen of the intelligent terminal may be a liquid crystal display screen or an electronic ink display screen.
It will be understood by those skilled in the art that the block diagram of fig. 7 is only a block diagram of part of the structure related to the solution of the present invention, and does not constitute a limitation on the intelligent terminal to which the solution of the present invention is applied; a specific intelligent terminal may include more or fewer components than those shown in the figure, or combine some components, or have a different arrangement of components.
In one embodiment, an intelligent terminal is provided, where the intelligent terminal includes a memory, a processor, and an event extraction program stored in the memory and executable on the processor, and the event extraction program performs the following operations when executed by the processor:
acquiring sentences to be extracted, and performing word coding and position coding on each word in the sentences to be extracted to obtain word embedded vectors and position embedded vectors corresponding to the sentences to be extracted;
adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting a contextualized expression vector of the sentence to be extracted through the encoder;
inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier, and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector;
adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, inputting the second input vector into a pre-trained event argument classifier, and obtaining an event argument corresponding to the statement to be extracted through the event argument classifier;
and constructing argument combinations according to the event arguments, performing event classification on the argument combinations and determining target event types corresponding to the argument combinations, wherein the target event type corresponding to one argument combination is any one of a non-event or an event type corresponding to the statement to be extracted.
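The step of acquiring the event type comprehensive vector is elaborated elsewhere in the description as a weighted sum over event types whose probability exceeds a threshold, with weighting coefficients either equal or equal to the event probabilities. A minimal sketch of that computation (the threshold value is an assumed hyperparameter):

```python
import numpy as np

def event_type_comprehensive_vector(type_probs, type_embeddings,
                                    threshold=0.5, use_probs=True):
    """Weighted sum of the embedding vectors of event types whose event
    probability exceeds the threshold; weights are either the probabilities
    themselves or equal coefficients (both variants appear in the patent)."""
    type_probs = np.asarray(type_probs, dtype=float)
    type_embeddings = np.asarray(type_embeddings, dtype=float)
    keep = type_probs > threshold               # event types to be processed
    if not keep.any():
        return np.zeros(type_embeddings.shape[1])
    weights = type_probs[keep] if use_probs else np.ones(int(keep.sum()))
    return weights @ type_embeddings[keep]

emb = np.eye(3)                                 # toy 3-type embedding table
vec = event_type_comprehensive_vector([0.9, 0.1, 0.6], emb)
```

With the identity table above, the result simply places each kept type's probability (or a 1) in that type's coordinate, which makes the weighting easy to inspect.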
The embodiment of the present invention further provides a computer-readable storage medium, where an event extraction program is stored on the computer-readable storage medium, and when the event extraction program is executed by a processor, the steps of any one of the event extraction methods provided in the embodiment of the present invention are implemented.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned functions may be distributed as different functional units and modules according to needs, that is, the internal structure of the apparatus may be divided into different functional units or modules to implement all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention. The specific working processes of the units and modules in the above-mentioned apparatus may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art would appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed system/terminal device and method can be implemented in other ways. For example, the above-described system/terminal device embodiments are merely illustrative, and for example, the division of the above modules or units is only one logical division, and the actual implementation may be implemented by another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed.
The integrated modules/units described above may be stored in a computer-readable storage medium if implemented in the form of software functional units and sold or used as independent products. Based on such understanding, all or part of the flow of the methods of the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the contents contained in the computer-readable storage medium may be increased or decreased as required by legislation and patent practice in the jurisdiction.
The above-mentioned embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that the technical solutions described in the foregoing embodiments may still be modified, or some technical features thereof may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the embodiments of the present invention, and they should be construed as being included therein.

Claims (10)

1. An event extraction method, characterized in that the event extraction method comprises:
acquiring sentences to be extracted, and performing word coding and position coding on each word in the sentences to be extracted to obtain word embedded vectors and position embedded vectors corresponding to the sentences to be extracted;
adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting a contextualized expression vector of the statement to be extracted through the encoder;
inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier, and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector;
adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, inputting the second input vector into a pre-trained event argument classifier, and obtaining event arguments corresponding to the statement to be extracted through the event argument classifier;
and constructing argument combinations according to the event arguments, performing event classification on the argument combinations, and determining target event types corresponding to the argument combinations, wherein the target event type corresponding to one argument combination is any one of a non-event or an event type corresponding to the statement to be extracted, each argument combination comprises a plurality of event arguments, the word attributes corresponding to the event arguments in one argument combination are different, and the target event type corresponding to one argument combination is determined according to the first word vector of each event argument in the argument combination.
2. The event extraction method according to claim 1, wherein the to-be-extracted sentence corresponds to a plurality of event types, the dimension of the event type comprehensive vector is the same as the dimension of the contextualized expression vector, and the inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the to-be-extracted sentence through the multi-label event type classifier, and acquiring the event type comprehensive vector corresponding to the to-be-extracted sentence according to the event type embedding vector comprises:
inputting the contextualized expression vector into a pre-trained multi-label event type classifier, and determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier;
and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedded vector.
3. The event extraction method according to claim 2, wherein the acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector comprises:
acquiring event probability corresponding to each event type, wherein the event probability is determined when the event type embedding vector corresponding to the statement to be extracted is determined through the multi-label event type classifier;
taking an event type with an event probability larger than a preset probability threshold value as a to-be-processed event type, and taking an event type embedding vector corresponding to the to-be-processed event type as a to-be-processed embedding vector;
and performing weighted summation on the to-be-processed embedded vectors corresponding to the to-be-processed event types to obtain the event type comprehensive vector, wherein the weighting coefficients corresponding to the to-be-processed embedded vectors are equal, or taking the event probability corresponding to the to-be-processed event types as the weighting coefficients of the to-be-processed embedded vectors corresponding to the to-be-processed event types.
4. The event extraction method according to claim 2, wherein the acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector comprises:
obtaining a weight matrix, a projection matrix and an event type embedding matrix, wherein the event type embedding matrix is obtained according to the event type embedding vectors, the weight matrix is a matrix with m rows and 1 column, the projection matrix and the event type embedding matrix are both matrices with m rows and d columns, m is the number of event types corresponding to the sentence to be extracted, and d is the dimension of the event type embedding vector;
calculating and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the weight matrix, the projection matrix, the event type embedding matrix and the contextualized expression vector;
each element in the weight matrix is an event probability corresponding to each event type, or each element in the weight matrix is 1.
5. The event extraction method according to claim 4, wherein the i-th event type comprehensive vector is the product of the transpose of a target matrix and the event type embedding matrix, the target matrix is obtained by taking the Hadamard product of a target product matrix and the weight matrix, and the target product matrix is the product of the i-th contextualized expression vector and the projection matrix.
6. The event extraction method according to any one of claims 1 to 5, wherein the constructing argument combinations according to the event arguments, performing event classification on each argument combination, and determining a target event type corresponding to each argument combination comprises:
acquiring word attributes corresponding to the event arguments, and combining the event arguments according to the word attributes corresponding to the event arguments to obtain a plurality of argument combinations;
and carrying out event classification on each argument combination and determining a target event type corresponding to each argument combination.
7. The event extraction method according to any one of claims 1 to 5, wherein the event classifying each argument combination and determining a target event type corresponding to each argument combination comprises:
and inputting the argument combination into a pre-trained argument combination event type classifier, performing event classification on each event argument combination through the argument combination event type classifier, and determining a target event type corresponding to each argument combination.
8. An event extraction device, characterized by comprising:
the sentence processing module is used for acquiring sentences to be extracted, and performing word coding and position coding on each word in the sentences to be extracted to obtain word embedded vectors and position embedded vectors corresponding to the sentences to be extracted;
the embedded vector processing module is used for adding the word embedded vector and the position embedded vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting a contextualized expression vector of the statement to be extracted through the encoder;
the event type determining module is used for inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier, and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector;
the event argument extraction module is used for adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, inputting the second input vector into a pre-trained event argument classifier, and acquiring event arguments corresponding to the statement to be extracted through the event argument classifier;
and the event argument processing module is used for constructing argument combinations according to the event arguments, performing event classification on the argument combinations and determining target event types corresponding to the argument combinations, wherein the target event type corresponding to one argument combination is any one of a non-event or an event type corresponding to the statement to be extracted, each argument combination comprises a plurality of event arguments, the word attributes corresponding to the event arguments in one argument combination are different, and the target event type corresponding to one argument combination is determined according to the initial implicit vectors of the event arguments in the argument combination.
9. An intelligent terminal, characterized in that the intelligent terminal comprises a memory, a processor and an event extraction program stored on the memory and operable on the processor, the event extraction program, when executed by the processor, implementing the steps of the event extraction method according to any one of claims 1-7.
10. A computer-readable storage medium, in which an event extraction program is stored, which when executed by a processor implements the steps of the event extraction method according to any one of claims 1 to 7.
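One dimension-consistent reading of the matrix computation in claims 4 and 5, treating the i-th contextualized expression vector as a d×1 column so that its product with the m×d projection matrix is m×1 (all sizes below are toy assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
m, d = 4, 8                      # m event types, d-dim embeddings (toy sizes)

W = rng.random((m, 1))           # weight matrix: m rows, 1 column
P = rng.normal(size=(m, d))      # projection matrix: m x d
E = rng.normal(size=(m, d))      # event type embedding matrix: m x d
h = rng.normal(size=(d, 1))      # i-th contextualized expression vector, d x 1

target_product = P @ h           # (m x 1) product with the projection matrix
target = target_product * W     # Hadamard product with the weight matrix
comprehensive = target.T @ E     # (1 x d) transposed target matrix times E
```

Setting every element of W to 1 recovers the unweighted variant of claim 4, while using event probabilities as the elements of W recovers the probability-weighted variant.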
CN202210661693.3A 2022-06-13 2022-06-13 Event extraction method and device, intelligent terminal and storage medium Active CN114757189B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210661693.3A CN114757189B (en) 2022-06-13 2022-06-13 Event extraction method and device, intelligent terminal and storage medium


Publications (2)

Publication Number Publication Date
CN114757189A CN114757189A (en) 2022-07-15
CN114757189B true CN114757189B (en) 2022-10-18

Family

ID=82337169

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210661693.3A Active CN114757189B (en) 2022-06-13 2022-06-13 Event extraction method and device, intelligent terminal and storage medium

Country Status (1)

Country Link
CN (1) CN114757189B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115186820B (en) * 2022-09-07 2023-01-10 粤港澳大湾区数字经济研究院(福田) Event coreference resolution method, device, terminal and computer readable storage medium

Citations (3)

Publication number Priority date Publication date Assignee Title
CN107122416A (en) * 2017-03-31 2017-09-01 北京大学 A kind of Chinese event abstracting method
CN110704598A (en) * 2019-09-29 2020-01-17 北京明略软件系统有限公司 Statement information extraction method, extraction device and readable storage medium
CN114385793A (en) * 2022-03-23 2022-04-22 粤港澳大湾区数字经济研究院(福田) Event extraction method and related device

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
CN111222305B (en) * 2019-12-17 2024-03-22 共道网络科技有限公司 Information structuring method and device
CN111475617B (en) * 2020-03-30 2023-04-18 招商局金融科技有限公司 Event body extraction method and device and storage medium
CN111753522A (en) * 2020-06-29 2020-10-09 深圳壹账通智能科技有限公司 Event extraction method, device, equipment and computer readable storage medium
CN112163416B (en) * 2020-10-09 2021-11-02 北京理工大学 Event joint extraction method for merging syntactic and entity relation graph convolution network
CN112597366B (en) * 2020-11-25 2022-03-18 中国电子科技网络信息安全有限公司 Encoder-Decoder-based event extraction method
CN113312464B (en) * 2021-05-28 2022-05-31 北京航空航天大学 Event extraction method based on conversation state tracking technology
CN114519344A (en) * 2022-01-25 2022-05-20 浙江大学 Discourse element sub-graph prompt generation and guide-based discourse-level multi-event extraction method
CN114490953B (en) * 2022-04-18 2022-08-19 北京北大软件工程股份有限公司 Method for training event extraction model, method, device and medium for extracting event

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN107122416A (en) * 2017-03-31 2017-09-01 北京大学 A kind of Chinese event abstracting method
CN110704598A (en) * 2019-09-29 2020-01-17 北京明略软件系统有限公司 Statement information extraction method, extraction device and readable storage medium
CN114385793A (en) * 2022-03-23 2022-04-22 粤港澳大湾区数字经济研究院(福田) Event extraction method and related device

Also Published As

Publication number Publication date
CN114757189A (en) 2022-07-15


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant