CN116702765A - Event extraction method and device and electronic equipment

Event extraction method and device and electronic equipment

Info

Publication number
CN116702765A
Authority
CN
China
Prior art keywords
event
vector
entity
text
vectors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310619154.8A
Other languages
Chinese (zh)
Inventor
李健铨
穆晶晶
胡加明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dingfu Intelligent Technology Co ltd
Original Assignee
Dingfu Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dingfu Intelligent Technology Co ltd filed Critical Dingfu Intelligent Technology Co ltd
Priority to CN202310619154.8A priority Critical patent/CN116702765A/en
Publication of CN116702765A publication Critical patent/CN116702765A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0895Weakly supervised learning, e.g. semi-supervised or self-supervised learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The application provides an event extraction method, an event extraction device and electronic equipment. The event extraction method comprises the following steps: acquiring an event prompt vector by using a neural network model, wherein the event prompt vector comprises a plurality of groups of token embedded vectors, and each group of token embedded vectors represents an event category; acquiring a text to be processed, and carrying out event classification on the text to be processed according to the event prompt vector to obtain an event category; extracting all entity elements from the text to be processed; identifying the entity elements corresponding to the event category from all the entity elements according to the event prompt vector; and carrying out element role recognition on the entity elements corresponding to the event category to obtain the role categories of the entity elements. Because the neural network model is used to acquire the event prompt vector and event extraction is performed on the text to be processed according to the event prompt vector, the event prompt vector effectively plays the role of a trigger word, and the accuracy of event extraction is improved.

Description

Event extraction method and device and electronic equipment
Technical Field
The application relates to the technical fields of natural language processing, prompt learning and event extraction, in particular to an event extraction method, an event extraction device and electronic equipment.
Background
Event Extraction (EE) is a classical Information Extraction (IE) task in the field of Natural Language Processing (NLP). It mainly extracts event information of interest from text data containing that information and expresses events stated in natural language in a structured form, covering, for example, the time, place, participant roles and related actions or state changes of an event.
Currently, event information is extracted from text content by trigger-word-based event extraction methods, for example with trigger words such as "new product release", "merger and acquisition" or "marketing release". However, in practice it has been found that some special scenes contain no trigger words at all. For example, text data from police-alert transcripts and recorded conversations exists in dialogue form, and dialogue data tends to be vague and colloquial, so it is difficult to identify clear trigger words in it. As a result, event extraction accuracy is low for data from such trigger-word-free scenes.
Disclosure of Invention
The embodiment of the application aims to provide an event extraction method, an event extraction device and electronic equipment, which are used for solving the problem of low accuracy of event extraction.
The embodiment of the application provides an event extraction method, which comprises the following steps: acquiring an event prompt vector by using a neural network model, wherein the event prompt vector comprises a plurality of groups of token embedded vectors, each group of token embedded vectors represents an event category, and each group of token embedded vectors comprises a plurality of token embedded vectors; acquiring a text to be processed, and carrying out event classification on the text to be processed according to the event prompt vector to obtain an event category; extracting all entity elements from the text to be processed; identifying the entity elements corresponding to the event category from all the entity elements according to the event prompt vector; and carrying out element role recognition on the entity elements corresponding to the event category to obtain the role categories of the entity elements, wherein the role categories of the entity elements are used for generating an event record table. In the implementation process of the scheme, the neural network model is used to acquire the event prompt vector, and event extraction is carried out on the text to be processed according to the event prompt vector. This alleviates the difficulty of identifying clear trigger words in dialogue-form data: the event prompt vector effectively plays the role of a trigger word, and the accuracy of event extraction is improved.
Optionally, in an embodiment of the present application, acquiring the event prompt vector using the neural network model includes: acquiring a pre-constructed category label description statement, category label interpretation definition statement, element role names and event keywords; inputting the category label description statement, the category label interpretation definition statement, the element role names and the event keywords into the neural network model to obtain a sentence representation vector output by the neural network model; and determining the event prompt vector according to a plurality of token embedded vectors in the sentence representation vector. In the implementation process of the scheme, the event prompt vector is obtained in a hard prompt manner, that is, pre-constructed data such as the category label description statement, the category label interpretation definition statement, the element role names and the event keywords are input into the model to obtain the sentence representation vector output by the model, and the event prompt vector is determined according to a plurality of token embedded vectors in the sentence representation vector, so that the neural network model learns more thoroughly and the accuracy of event extraction is improved.
Optionally, in an embodiment of the present application, determining the event prompt vector according to a plurality of token embedded vectors in the sentence representation vector includes: screening out a category token embedded vector from the plurality of token embedded vectors in the sentence representation vector, and determining the category token embedded vector as the event prompt vector; or carrying out maximum pooling processing on the plurality of token embedded vectors in the sentence representation vector to obtain the event prompt vector; or carrying out mean pooling processing on the plurality of token embedded vectors in the sentence representation vector to obtain the event prompt vector; or carrying out minimum pooling processing on the plurality of token embedded vectors in the sentence representation vector to obtain the event prompt vector.
Optionally, in an embodiment of the present application, acquiring the event prompt vector using the neural network model includes: acquiring an event matrix, wherein the event matrix is a matrix structure constructed according to the total number of event categories and a plurality of token embedded vectors of each event category, and is obtained by learning the matrix structure using the neural network model; and for each event category, determining the event prompt vector according to a plurality of token embedded vectors in the event matrix. In the implementation process of the scheme, the event prompt vector is obtained in a soft prompt manner, that is, the event prompt vector is determined according to a plurality of token embedded vectors in a learnable event matrix, so that the neural network model learns more thoroughly and the accuracy of event extraction is improved.
Optionally, in an embodiment of the present application, the event matrix includes a plurality of token embedded vectors, and determining the event prompt vector according to the plurality of token embedded vectors in the event matrix includes: carrying out maximum pooling processing on the plurality of token embedded vectors in the event matrix to obtain the event prompt vector; or carrying out mean pooling processing on the plurality of token embedded vectors in the event matrix to obtain the event prompt vector; or carrying out minimum pooling processing on the plurality of token embedded vectors in the event matrix to obtain the event prompt vector.
Optionally, in an embodiment of the present application, carrying out event classification on the text to be processed according to the event prompt vector includes: acquiring a text representation vector of the text to be processed and the event categories of the plurality of groups of token embedded vectors in the event prompt vector; for each group of token embedded vectors in the plurality of groups of token embedded vectors, judging whether the similarity value between the text representation vector and that group of token embedded vectors is greater than a preset threshold; and if so, determining the event category of the text to be processed as the event category of that group of token embedded vectors. In the implementation process of the scheme, the event category is determined through the similarity value between the text representation vector and the token embedded vectors, that is, event classification is realized according to the event prompt vector. Since the event prompt vector can represent event information (including event category information), the accuracy of event classification, and hence of event extraction, is improved.
Optionally, in an embodiment of the present application, identifying, according to the event prompt vector, the entity elements corresponding to the event category from all the entity elements includes: extracting all the entity elements from each text sentence of the text to be processed, and converting all the entity elements into a plurality of element vectors; for each element vector of the plurality of element vectors, determining whether a similarity value between the element vector and a group of token embedded vectors of the event prompt vector is greater than a similarity threshold; and if so, determining the entity element corresponding to the element vector as an entity element corresponding to the event category of that group of token embedded vectors. In the implementation process of the scheme, entity extraction is performed through the similarity value between the element vector and a group of token embedded vectors in the event prompt vector, that is, the entity element corresponding to the element vector is determined as an entity element corresponding to the event category of that group of token embedded vectors.
Optionally, in an embodiment of the present application, before the event prompt vector is acquired using the neural network model, the method further includes: acquiring a sample text and sample labels, wherein the sample labels include an event category label, an entity element label, an event entity relation label and an entity role label of the sample text; the event category label is the set of event categories included in the sample text, the entity element label is the set of entity elements extracted from the sample sentences of the sample text, the event entity relation label is the set of correspondences between entity elements and events, and the entity role label is the role category of each entity element in an event; and training a neural network by taking the sample text as training data and taking the event category label, the entity element label, the event entity relation label and the entity role label of the sample text as training labels, to obtain the neural network model. In the implementation process of the scheme, the neural network is trained by simultaneously using the event category label, the entity element label, the event entity relation label and the entity role label of the sample text, so that a neural network model with better robustness is obtained, and using this model can improve the accuracy of event extraction.
Optionally, in an embodiment of the present application, training the neural network includes: acquiring the event prompt vector by using the neural network model, pooling all token embedded vectors in the sample text to obtain a sample representation vector, calculating the text similarity between the event prompt vector and the sample representation vector, and determining a first loss value according to the event category label and the text similarity; extracting sample entity elements from each sample sentence of the sample text by using the neural network model, and calculating a second loss value between the sample entity elements and the entity element label of the sample sentence; pooling the entity element tokens in the sample text to obtain entity representation vectors, calculating the vector similarity between the event prompt vector corresponding to the sample text and the entity representation vectors, and determining a third loss value according to the event entity relation label and the vector similarity; extracting sample role categories of the sample entity elements by using the neural network model, and calculating a fourth loss value between the sample role categories and the entity role label of the sample text; and training the neural network according to the first loss value, the second loss value, the third loss value and/or the fourth loss value. In the implementation process of the scheme, the neural network model is trained according to the first loss value, the second loss value, the third loss value and/or the fourth loss value, so that a neural network model with better robustness is obtained, and using this model can improve the accuracy of event extraction.
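For illustration only, the following is a minimal PyTorch sketch of how the four loss terms described above might be combined; the tensor shapes, the particular loss functions and the equal weighting are all assumptions, since the embodiment does not disclose concrete formulas.

```python
import torch
import torch.nn.functional as F

# Hypothetical combination of the four training objectives; every shape and
# loss choice below is an assumption, not the patent's exact formulation.
def total_loss(sim_text_event,    # (num_event_types,) text/event-prompt similarities
               event_labels,      # (num_event_types,) 1 if the event occurs in the text
               ner_logits,        # (seq_len, num_bio_tags) entity tagging scores
               ner_labels,        # (seq_len,) gold BIO tag ids
               sim_entity_event,  # (num_entities, num_event_types) entity/event similarities
               relation_labels,   # (num_entities, num_event_types) 1 if entity belongs to event
               role_logits,       # (num_entities, num_roles) role classification scores
               role_labels):      # (num_entities,) gold role ids
    l1 = F.binary_cross_entropy_with_logits(sim_text_event, event_labels.float())
    l2 = F.cross_entropy(ner_logits, ner_labels)
    l3 = F.binary_cross_entropy_with_logits(sim_entity_event, relation_labels.float())
    l4 = F.cross_entropy(role_logits, role_labels)
    return l1 + l2 + l3 + l4  # equal weighting is an assumption
```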
The embodiment of the application also provides an event extraction device, which comprises: a prompt vector extraction module, configured to obtain an event prompt vector by using a neural network model, wherein the event prompt vector comprises a plurality of groups of token embedded vectors, each group of token embedded vectors represents an event category, and each group of token embedded vectors comprises a plurality of token embedded vectors; an event classification obtaining module, configured to obtain a text to be processed, and carry out event classification on the text to be processed according to the event prompt vector to obtain an event category; an entity element extraction module, configured to extract all entity elements from the text to be processed; an entity element identification module, configured to identify the entity elements corresponding to the event category from all the entity elements according to the event prompt vector; and a role category obtaining module, configured to carry out element role identification on the entity elements corresponding to the event category to obtain the role categories of the entity elements, wherein the role categories of the entity elements are used for generating an event record table.
Optionally, in an embodiment of the present application, the prompt vector extraction module includes: a construction data acquisition sub-module, configured to acquire a pre-constructed category label description statement, category label interpretation definition statement, element role names and event keywords; a representation vector obtaining sub-module, configured to input the category label description statement, the category label interpretation definition statement, the element role names and the event keywords into the neural network model to obtain a sentence representation vector output by the neural network model; and a first vector determination sub-module, configured to determine the event prompt vector according to a plurality of token embedded vectors in the sentence representation vector.
Optionally, in an embodiment of the present application, the first vector determination submodule includes: a first vector determination unit, configured to screen out a category token embedded vector from a plurality of token embedded vectors in the sentence representation vector, and determine the category token embedded vector as an event prompt vector; or, the second vector determining unit is used for carrying out maximum pooling processing on a plurality of token embedded vectors in the sentence representation vector to obtain an event prompt vector; or, the third vector determining unit is used for carrying out mean value pooling processing on a plurality of token embedded vectors in the sentence representation vector to obtain an event prompt vector; or the fourth vector determining unit is used for carrying out minimum pooling processing on a plurality of token embedded vectors in the sentence representation vector to obtain the event prompt vector.
Optionally, in an embodiment of the present application, the prompt vector extraction module includes: an event matrix acquisition sub-module, configured to acquire an event matrix, wherein the event matrix is a matrix structure constructed according to the total number of event categories and a plurality of token embedded vectors of each event category, and is obtained by learning the matrix structure using the neural network model; and a second vector determination sub-module, configured to determine, for each event category, the event prompt vector according to a plurality of token embedded vectors in the event matrix.
Optionally, in an embodiment of the present application, the event matrix includes: a plurality of token embedded vectors; a second vector determination submodule comprising: the fifth vector determining unit is used for carrying out maximum pooling processing on a plurality of token embedded vectors in the event matrix to obtain event prompt vectors; or, the sixth vector determining unit is used for carrying out mean value pooling processing on the token embedded vectors in the event matrix to obtain an event prompt vector; or the seventh vector determining unit is used for carrying out minimum pooling processing on the token embedded vectors in the event matrix to obtain the event prompt vector.
Optionally, in an embodiment of the present application, the event classification obtaining module includes: an event category acquisition sub-module, configured to acquire a text representation vector of the text to be processed and the event categories of the plurality of groups of token embedded vectors in the event prompt vector; a text vector judging sub-module, configured to judge, for each group of token embedded vectors in the plurality of groups of token embedded vectors, whether the similarity value between the text representation vector and that group of token embedded vectors is greater than a preset threshold; and an event category determining sub-module, configured to determine the event category of the text to be processed as the event category of a group of token embedded vectors if the similarity value between the text representation vector and that group of token embedded vectors is greater than the preset threshold.
Optionally, in an embodiment of the present application, the entity element identification module includes: the element extraction and conversion sub-module is used for extracting all entity elements from each text sentence of the text to be processed and converting all entity elements into a plurality of element vectors; a similarity value judging sub-module, configured to judge, for each element vector of the plurality of element vectors, whether a similarity value between the element vector and a set of token embedded vectors of the event prompt vector is greater than a similarity threshold; and the entity element determining sub-module is used for determining the entity element corresponding to the element vector as the entity element corresponding to the event category of the token embedded vector if the similarity value between the element vector and the event prompt vector is larger than the similarity threshold value.
Optionally, in an embodiment of the present application, the role category obtaining module includes: an element role identification sub-module, configured to carry out element role identification on the entity elements corresponding to the event category by using a feed-forward neural network (FFN).
Optionally, in an embodiment of the present application, the event extraction device further includes: the text label acquisition module is used for acquiring sample text and sample labels, and the sample labels comprise: event category labels of sample texts, entity element labels, event entity relation labels and entity role labels, wherein the event category labels are event category sets included in the sample texts, the entity element labels are entity element sets extracted from sample sentences of the sample texts, the event entity relation labels are corresponding relation sets between entity elements and events, and the entity role labels are role categories of the entity elements in the events; the neural network training module is used for training the neural network by taking the sample text as training data and taking the event category label, the entity element label, the event entity relation label and the entity role label of the sample text as training labels to obtain a neural network model.
Optionally, in an embodiment of the present application, the neural network training module includes: the first loss calculation sub-module is used for acquiring event prompt vectors by using a neural network, pooling all token embedded vectors in the sample text to obtain sample expression vectors, calculating text similarity between the event prompt vectors and the sample expression vectors, and calculating the text similarity according to event category labels of the sample text to obtain a first loss value; the second loss calculation sub-module is used for extracting sample entity elements from each sample sentence of the sample text by using the neural network, and calculating a second loss value between the sample entity elements of the sample sentence and the entity element labels of the sample sentence; the third loss calculation sub-module is used for pooling entity element tokens in the sample text to obtain entity representation vectors, calculating vector similarity between event prompt vectors corresponding to the sample text and the entity representation vectors, and calculating the vector similarity according to event entity relation labels to obtain a third loss value; a fourth loss calculation sub-module, configured to extract a sample role category of a sample entity element using a neural network, and calculate a fourth loss value between the sample role category and an entity role label of the sample text; the neural network training sub-module is used for training the neural network according to the first loss value, the second loss value, the third loss value and/or the fourth loss value.
Optionally, in an embodiment of the present application, the neural network model includes: a BERT model, a RoBERTa model, a UniLM model, an XLNet model, an RNN model, a CNN model, or a model based on a Transformer structure.
The embodiment of the application also provides electronic equipment, which comprises: a processor and a memory, the memory storing machine-readable instructions executable by the processor, the instructions, when executed by the processor, performing the method described above.
Embodiments of the present application also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs a method as described above.
Additional features and advantages of embodiments of the application will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of embodiments of the application.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application, and therefore should not be considered as limiting the scope, and other related drawings can be obtained according to these drawings without inventive effort to those of ordinary skill in the art.
Fig. 1 is a schematic flow chart of an event extraction method according to an embodiment of the present application;
fig. 2 is a schematic diagram of an event extraction process of a text to be processed according to an embodiment of the present application;
FIG. 3 is a schematic diagram of the network structure of a Transformer-class pre-training language model according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of training a neural network model according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an event extraction device according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it should be understood that the accompanying drawings in the embodiments of the present application are only for the purpose of illustration and description, and are not intended to limit the scope of the embodiments of the present application. In addition, it should be understood that the schematic drawings are not drawn to scale. The flowcharts used in the embodiments of the present application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be implemented out of order and that steps without logical context may be performed in reverse order or concurrently. Moreover, one or more other operations may be added to or removed from the flow diagrams by those skilled in the art under the direction of the teachings of the embodiments of the present application.
In addition, the described embodiments are only some, but not all, of the embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Accordingly, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the claimed embodiments of the application, but is merely representative of selected embodiments of the application.
It will be appreciated that "first" and "second" in embodiments of the application are used to distinguish similar objects. It will be appreciated by those skilled in the art that the words "first", "second", etc. do not limit the number and order of execution, and that the objects they modify do not necessarily differ. In the description of the embodiments of the present application, the term "and/or" is merely an association relationship describing associated objects, and indicates that three relationships may exist; for example, A and/or B may indicate: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship. The term "plurality" refers to two or more (including two).
Before describing the event extraction method provided by the embodiment of the present application, some concepts related in the embodiment of the present application are described first:
prompt Learning (Prompt Learning), also known as Prompt-based Learning (Prompt Learning), refers to a Learning paradigm in the NLP field that uses a pre-trained language model to perform predictive tasks, the Learning paradigm consisting essentially of: pre-trained), prompt (prompt), and prediction (prediction), which can replace the traditional fine-tuning based learning paradigm, which may include: pre-trained) and fine-tuning (fine-tuning).
Bidirectional Encoder Representations from Transformers (BERT) is a language representation model. BERT encodes input with bidirectional Transformer encoder layers, and operations such as encoding, decoding, the self-attention mechanism and transformation can be performed using the BERT model.
Bidirectional and Auto-Regressive Transformers (BART) refers to a denoising autoencoder built as a sequence-to-sequence model that can be applied to a very wide range of natural language processing tasks; BART combines a bidirectional Transformer with an autoregressive Transformer for pre-training.
A Pre-training Language Model (PLM), also called a pre-training model for short, refers to a neural network model obtained by taking a large amount of text corpus as training data and performing semi-supervised machine learning on a neural network using that training data.
It should be noted that the event extraction method provided in the embodiment of the present application may be executed by an electronic device, where the electronic device refers to a device terminal or a server having the function of executing a computer program. The device terminal is, for example: a smart phone, a personal computer, a tablet computer, a personal digital assistant, or a mobile internet appliance. A server refers to a device that provides computing services over a network, such as an x86 server or a non-x86 server; non-x86 servers include: mainframes, minicomputers, and UNIX servers.
Application scenarios to which the event extraction method is applicable are described below; these scenarios include, but are not limited to: extracting events from special-scene texts in which there are no trigger words or trigger words are difficult to identify, where such texts include, but are not limited to: police-alert transcript text, speech-recognized meeting text, man-machine dialogue text, and the like. Of course, the event extraction method can also be used to carry out event extraction tasks such as event detection, entity extraction, combination extraction and entity recording on the text to be processed. Event detection refers to identifying all possible event categories (event types) in the text via various classification labels. Entity extraction, also referred to as argument extraction or entity element extraction, refers to extracting the entity elements of a detected event from the text and encoding those entity elements into dense vectors. Combination extraction refers to extracting the combination between an entity element and an event, i.e., the combination indicates that the entity element belongs to a specific event. Entity recording, also called argument labeling or argument role labeling, refers to determining the role category of an entity element in an event by combining the event type with the combination between the entity element and the event, and generating a final entity record or entity record table according to the role categories of the entity elements in the event.
Please refer to fig. 1, which is a schematic flow chart of an event extraction method according to an embodiment of the present application. The main idea of the event extraction method is to carry out event extraction by means of prompt learning, so that the event prompt vector effectively plays the role of a trigger word, avoiding the situation in which event extraction cannot be performed because clear trigger words are difficult to identify in certain data, such as dialogue-form data.
Please refer to fig. 2, which illustrates a schematic diagram of an event extraction process for a text to be processed according to an embodiment of the present application. The specific content of the text to be processed may be: "On October 16, 2016, HG Company signed a contract. The contract stipulates that HG Company pledges 80,000 of its shares to WL Company." After event extraction is carried out on this text by the method provided in the embodiment of the application, the event category is "Share Pledge", and the entity elements corresponding to the event category include: HG Company, WL Company, 80,000 shares and "October 16, 2016", wherein the argument role of HG Company is the pledger (Pledger), the argument role of WL Company is the pledgee (Pledgee), the argument role of 80,000 shares is the pledged shares (Shares), and the argument role of "October 16, 2016" is the start date.
The embodiment of the event extraction method may include:
step S110: an event hint vector is obtained using a neural network model, the event hint vector comprising a plurality of sets of token embedding vectors, each set of token embedding vectors characterizing an event category, the each set of token embedding vectors comprising a plurality of token embedding vectors.
It will be appreciated that there is a wide variety of implementations of the above step S110, including but not limited to: a Hard Prompt approach, in which the event prompt vector is obtained using an artificially designed or constructed prompt template, and a Soft Prompt approach, in which the event prompt vector is obtained by concatenating a learnable embedding vector before the text representation vector. Embodiments of step S110 are therefore described in detail below.
Step S120: the text to be processed is acquired, and event classification is carried out on the text to be processed according to the event prompt vector to obtain an event category.
The text to be processed refers to a text comprising a plurality of text sentences, where each text sentence corresponds to one or more event categories; each text to be processed therefore includes text sentences of multiple event categories. When a text sentence corresponds to a plurality of event categories, the task may be understood as a multi-event classification task, i.e., a multi-label classification task.
Event category (Event Type), also referred to as event type, refers to the result of classifying an event in the text to be processed. For example, the text to be processed may include: a merger-and-acquisition event, a shareholder share-reduction event, a share pledge event, and the like.
It will be appreciated that the event prompt vector is capable of characterizing event information; that is, the event prompt vector includes a plurality of groups of token embedded vectors, and each group of token embedded vectors characterizes an event category (the event information includes the event category). For example: the similarity between one group of token embedded vectors in the event prompt vector and the event category characterization vector of "Share Pledge" (which can be understood as a group of token embedded vectors of a specific event category) is greater than its similarity to the characterization vectors of other event categories, and event classification can then be performed according to this similarity. Thus, the text to be processed can be classified according to the event prompt vector to obtain an event category, which is the result of event classification; event classification is in turn a subtask of event detection.
Step S130: all entity elements are extracted from the text to be processed.
Step S140: the entity elements corresponding to the event category are identified from all the entity elements according to the event prompt vector.
It will be appreciated that the event prompt vector can characterize not only event information but also the entity elements in a specific event. For example: since the event prompt vector is similar to the event category characterization vector of "Share Pledge", it can be presumed that the event prompt vector is also similar to the characterization vector of the "pledger" entity element and to the characterization vector of the "pledgee" entity element, where entity elements are elements such as "pledger" or "pledgee". Therefore, the entity elements corresponding to the event category can be extracted from the text to be processed according to the event prompt vector.
Step S150: element role recognition is carried out on the entity elements corresponding to the event category to obtain the role categories of the entity elements, wherein the role categories of the entity elements are used for generating an event record table.
It will be appreciated that the role categories of the entity element may be used to generate an event record table (generate event records).
In the implementation process, the neural network model is used for acquiring the event prompt vector, and the event extraction is carried out on the text to be processed according to the event prompt vector, so that the situation that the event cannot be accurately extracted due to the fact that clear trigger words are difficult to identify from the text to be processed is avoided, the action of the trigger words is effectively realized by using the event prompt vector, and the accuracy of event extraction is improved.
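As a purely illustrative aid, the following Python sketch shows the shape of the overall inference flow of steps S110 to S150; every injected helper (encode_text, extract_entities, encode_entity, classify_role) and the 0.5 threshold are hypothetical placeholders, not components disclosed by the patent.

```python
from typing import Callable, Dict, List
import torch
import torch.nn.functional as F

def cosine(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # cos_similarity(a, b) = (a . b) / (|a| |b|)
    return F.cosine_similarity(a, b, dim=-1)

def extract_events(
    text: str,
    prompts: Dict[str, torch.Tensor],              # S110: one prompt vector per category
    encode_text: Callable[[str], torch.Tensor],    # text -> text representation vector
    extract_entities: Callable[[str], List[str]],  # S130: e.g. a BiLSTM+CRF tagger
    encode_entity: Callable[[str], torch.Tensor],  # entity element -> element vector
    classify_role: Callable[[str, str], str],      # (entity, event category) -> role
    threshold: float = 0.5,                        # assumed value
) -> Dict[str, Dict[str, str]]:
    text_vec = encode_text(text)
    records: Dict[str, Dict[str, str]] = {}
    for category, prompt in prompts.items():       # S120: event classification
        if cosine(text_vec, prompt) <= threshold:
            continue
        entities = extract_entities(text)          # S130: all entity elements
        members = [e for e in entities             # S140: entity/event matching
                   if cosine(encode_entity(e), prompt) > threshold]
        records[category] = {e: classify_role(e, category) for e in members}  # S150
    return records
```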
As a first alternative embodiment of the above step S110, when the neural network model is used to obtain the event Prompt vector, the event Prompt vector may be obtained using a Hard Prompt (Hard Prompt) method, where the Hard Prompt (Hard Prompt) method refers to obtaining the event Prompt vector using a manually designed or constructed Prompt template, and this embodiment may include:
step S111: and obtaining a pre-constructed category label description statement, a category label interpretation definition statement, an element role name and an event keyword.
Step S112: and inputting the category label description statement, the category label interpretation definition statement, the element role name and the event key word into the neural network model to obtain a sentence representation vector output by the neural network model.
The neural network model in the embodiment of the application may adopt a pre-training language model such as a BERT model, a RoBERTa model, a UniLM model, an XLNet model, a GloVe model, a GPT model, a BART model, an ELMo model or a Sentence-BERT model, and may also adopt an RNN model, a CNN model, or another pre-training language model based on a Transformer structure.
Please refer to fig. 3, which illustrates a network structure diagram of a Transformer-class pre-training language model according to an embodiment of the present application. The above-mentioned Transformer-class neural network models may include: a plurality of Transformer blocks (Transformer-Block), an embedding layer (Embedding Layer) and an output layer (Output Layer), connected as shown in fig. 3; each Transformer block may include a plurality of encoders and a plurality of decoders, whose connections are likewise shown in fig. 3 and are not described herein.
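To make the Fig. 3 description concrete, here is a rough encoder-only PyTorch sketch of the embedding layer, stacked Transformer blocks and output layer; all layer sizes are illustrative, and restricting the blocks to encoders is a simplifying assumption.

```python
import torch
import torch.nn as nn

# Encoder-only simplification of the Fig. 3 layout: embedding layer, stacked
# Transformer blocks, output layer. All sizes are illustrative assumptions.
class PromptEncoder(nn.Module):
    def __init__(self, vocab_size=21128, hidden=768, heads=12, layers=12):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        block = nn.TransformerEncoderLayer(hidden, heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=layers)
        self.output = nn.Linear(hidden, hidden)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.output(self.blocks(self.embed(token_ids)))
```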
Step S113: the event prompt vector is determined according to a plurality of token embedded vectors in the sentence representation vector.
There are various implementations of the above step S113, including but not limited to the following:

In a first embodiment, after a plurality of token embedded vectors in the sentence representation vector output by the neural network model are obtained, a category token embedded vector screened out from the plurality of token embedded vectors is determined as the event prompt vector. This includes: parsing the category label description statement, the category label interpretation definition statement, the element role names and the event keywords from a prompt template; splicing the "[CLS]" label, the category label description statement, the category label interpretation definition statement, the element role names, the event keywords and the "[SEP]" label to obtain a spliced character string; and inputting the spliced character string into the neural network model to obtain the sentence representation vector output by the neural network model. Since the sentence representation vector includes a plurality of token embedded vectors, the category token embedded vector corresponding to the event category token (i.e., the vector corresponding to the "[CLS]" token) can be selected from the plurality of token embedded vectors in the sentence representation vector and determined as the event prompt vector. The prompt template may include: category label description statements, category label interpretation definition statements, element role names, and/or event keywords.

In a second embodiment, the event prompt vector is obtained through max-pooling: maximum pooling (max-pooling) is carried out on the plurality of token embedded vectors in the sentence representation vector to obtain a maximum token embedded vector, and the maximum token embedded vector is determined as the event prompt vector.

In a third embodiment, the event prompt vector is obtained through mean-pooling: mean pooling (mean-pooling) is carried out on the plurality of token embedded vectors in the sentence representation vector to calculate a mean token embedded vector, and the mean token embedded vector is determined as the event prompt vector.

In a fourth embodiment, the event prompt vector is obtained through min-pooling: minimum pooling (min-pooling) is carried out on the plurality of token embedded vectors in the sentence representation vector to obtain a minimum token embedded vector, and the minimum token embedded vector is determined as the event prompt vector.
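A minimal sketch of the four hard-prompt embodiments above, using the HuggingFace transformers library; the checkpoint name and the English template text are invented examples, not the patent's actual templates.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")  # assumed checkpoint
model = BertModel.from_pretrained("bert-base-uncased")

# Invented template: label description + interpretation definition + role names + keywords.
prompt = ("share pledge. a shareholder pledges its shares to a pledgee. "
          "roles: pledger; pledgee; shares; start date. keywords: pledge, mortgage")
inputs = tokenizer(prompt, return_tensors="pt")   # tokenizer adds [CLS] ... [SEP]
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state    # (1, seq_len, hidden_size)

cls_vec  = hidden[:, 0]                 # first embodiment: the "[CLS]" token embedding
max_vec  = hidden.max(dim=1).values     # second embodiment: max-pooling
mean_vec = hidden.mean(dim=1)           # third embodiment: mean-pooling
min_vec  = hidden.min(dim=1).values     # fourth embodiment: min-pooling
```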
As a second alternative embodiment of the above step S110, when the event prompt vector is acquired using the neural network model, the event prompt vector may be acquired in a Soft Prompt manner, where the Soft Prompt manner refers to obtaining the event prompt vector through a learnable embedding vector. This embodiment may include:
step S114: an event matrix is obtained, wherein the event matrix is a matrix structure constructed according to the total number of event categories and a plurality of token embedded vectors of each event category, and the matrix structure is obtained by learning by using a neural network model.
It will be appreciated that each set of token embedded vectors may characterize an event category, where each set of token embedded vectors includes a plurality of token embedded vectors, and each token embedded vector is a vector of a plurality of tokens (tokens), such as a token embedded vector of K tokens.
The embodiment of step S114 described above is, for example: based on the index number of the event type, and assuming that there are K tokens per event category, an event matrix constructed from the total number of index numbers and K tokens can be obtained: the first event category corresponds to the first K tokens, the second event category corresponds to the tokens from K+1 to 2K, and so on. The matrix is then trained and learned using the neural network model to obtain the final event matrix. Thus, the event matrix is a matrix structure constructed from the total number of event categories and a plurality of token embedded vectors of each event category, and is obtained by learning the matrix structure using the neural network model.
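A minimal sketch of such a learnable event matrix; num_event_types, K and hidden_size are illustrative values, not figures from the patent.

```python
import torch
import torch.nn as nn

# Learnable soft-prompt event matrix: num_event_types * K rows, one block of
# K token embedded vectors per event category. All sizes are illustrative.
num_event_types, K, hidden_size = 10, 4, 768
event_matrix = nn.Parameter(torch.randn(num_event_types * K, hidden_size))

def event_tokens(category_index: int) -> torch.Tensor:
    # rows [index*K, (index+1)*K) hold the K token embedded vectors of one category
    return event_matrix[category_index * K:(category_index + 1) * K]
```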
Step S115: for each event category, the event prompt vector is determined according to a plurality of token embedded vectors in the event matrix.
It will be appreciated that the event matrix described above may include a plurality of token embedded vectors.
There are various implementations of the above step S115, including but not limited to the following:

In a first embodiment, the maximum token embedded vector is determined as the event prompt vector: maximum pooling (max-pooling) is carried out on the plurality of token embedded vectors in the event matrix to obtain a maximum token embedded vector, and the maximum token embedded vector is determined as the event prompt vector.

In a second embodiment, the mean token embedded vector is determined as the event prompt vector: mean pooling (mean-pooling) is carried out on the plurality of token embedded vectors in the event matrix to obtain a mean token embedded vector, and the mean token embedded vector is determined as the event prompt vector.

In a third embodiment, the minimum token embedded vector is determined as the event prompt vector: minimum pooling (min-pooling) is carried out on the plurality of token embedded vectors in the event matrix to obtain a minimum token embedded vector, and the minimum token embedded vector is determined as the event prompt vector.
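Continuing the event-matrix sketch above, the three pooling embodiments of step S115 might look as follows (event_tokens is the hypothetical helper defined earlier):

```python
# Derive one prompt vector per category from its K token embedded vectors.
tokens = event_tokens(0)                  # (K, hidden_size) rows of event category 0
prompt_max  = tokens.max(dim=0).values    # first embodiment: maximum pooling
prompt_mean = tokens.mean(dim=0)          # second embodiment: mean pooling
prompt_min  = tokens.min(dim=0).values    # third embodiment: minimum pooling
```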
As an alternative embodiment of the above step S120, when classifying the text to be processed according to the event prompt vector, the event classification may be determined according to a similarity value between the text expression vector and the event prompt vector, and the embodiment may include:
step S121: and acquiring a text representation vector of the text to be processed and event categories of a plurality of groups of token embedded vectors in the event prompt vector.
The embodiment of step S121 described above is, for example: the artificially designed or constructed prompt template and the text to be processed are input into the neural network model, and the text representation vector output by the neural network model is then obtained. Since the prompt template is artificially designed or constructed, the event category corresponding to the event prompt vector is already known when the template is designed or constructed, and at this point the event categories of the plurality of groups of token embedded vectors in the event prompt vector can be obtained directly. In a specific practical process, one prompt template corresponds to one event category; a plurality of prompt templates can be artificially designed or constructed, and the plurality of prompt templates can correspondingly represent a plurality of event categories. Alternatively, the text to be processed is input into the neural network model, the text representation vector output by the neural network model is then obtained, the category embedding vector of the "[CLS]" label is obtained from the text representation vector output by the model, and the event category of the event prompt vector is determined according to this category embedding vector.
Step S122: for each group of token embedded vectors in the plurality of groups of token embedded vectors, it is judged whether the similarity value between the text representation vector and that group of token embedded vectors is greater than a preset threshold.
The embodiment of step S122 described above is, for example: for each group of token embedded vectors in the plurality of groups, assuming that the text representation vector is denoted as a and a group of token embedded vectors in the event prompt vector is denoted as b, the similarity value between the two may be calculated with the formula cos_similarity(a, b) = (a · b) / (‖a‖ ‖b‖), and it is then judged whether this similarity value is greater than the preset threshold. Here, a is the text representation vector, b is a group of token embedded vectors in the event prompt vector, and cos_similarity(a, b) represents the similarity value between the text representation vector and that group of token embedded vectors in the event prompt vector.
Step S123: if the similarity value between the text representation vector and a group of token embedded vectors in the event prompt vector is greater than the preset threshold, the event category of the text to be processed is determined as the event category of that group of token embedded vectors.
The embodiment of the above steps S122 to S123 is, for example: an executable program compiled or interpreted from a preset programming language is used to judge whether the similarity value between the text representation vector and a group of token embedded vectors in the event prompt vector is greater than the preset threshold; if it is, the executable program determines the event category of the text to be processed as the event category of that group of token embedded vectors. Programming languages that can be used here include, for example: C, C++, Java, BASIC, JavaScript, LISP, Shell, Perl, Ruby, Python, PHP, etc.
In the implementation process, the event category is determined according to the similarity value between the text representation vector predicted by the model and the event prompt vector, so that one neural network model can complete multiple tasks (such as the event detection task and the event classification task). The cumbersome process of retraining and fine-tuning for each specific task is avoided, and the efficiency of event extraction using the neural network model is improved.
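A minimal sketch of this threshold-based, multi-label classification; the function name classify_events and the 0.5 threshold are assumptions, not values from the patent.

```python
import torch
import torch.nn.functional as F
from typing import Dict, List

# Threshold-based multi-label event classification (steps S121 to S123).
def classify_events(text_vec: torch.Tensor,
                    prompts: Dict[str, torch.Tensor],
                    threshold: float = 0.5) -> List[str]:
    detected = []
    for category, prompt in prompts.items():
        # cos_similarity(a, b) = (a . b) / (|a| |b|)
        if F.cosine_similarity(text_vec, prompt, dim=-1) > threshold:
            detected.append(category)  # several categories may pass (multi-label)
    return detected
```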
As an alternative implementation of the above step S130, when all entity elements are extracted from the text to be processed (i.e., entity extraction or element extraction), the entity extraction may be performed using a BiLSTM model plus a CRF layer, or using an LSTM model plus a CRF layer. This may include:
In a first embodiment, entity extraction is performed using a BiLSTM model plus a CRF layer, for example: the text to be processed is vectorized with the BiLSTM model to obtain text representation vectors, and named entity recognition is carried out on the text to be processed using a Conditional Random Field (CRF) layer to obtain all the entity elements.
In a second embodiment, entity extraction is performed using an LSTM model plus a CRF layer, for example: the text to be processed is vectorized with the LSTM model to obtain text representation vectors, and named entity recognition is carried out on the text to be processed using the conditional random field (CRF) layer to obtain all the entity elements. Assume the specific content of the text to be processed is: "On October 16, 2016, HG Company signed a contract. The contract stipulates that HG Company pledges 80,000 of its shares to WL Company." All the entity elements extracted from the text to be processed may then include: HG Company, WL Company, 80,000 shares, and "October 16, 2016".
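A generic BiLSTM+CRF tagger sketch for step S130; the CRF layer comes from the third-party pytorch-crf package (an assumed dependency), and all hyperparameters are illustrative, not values from the patent.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # pytorch-crf package: an assumed third-party dependency

# Generic BiLSTM+CRF tagger for entity extraction; sizes are illustrative.
class BiLSTMCRF(nn.Module):
    def __init__(self, vocab_size=21128, embed_dim=128, hidden=256, num_tags=9):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden // 2,
                            bidirectional=True, batch_first=True)
        self.fc = nn.Linear(hidden, num_tags)      # per-token BIO emission scores
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, token_ids, tags=None):
        emissions = self.fc(self.lstm(self.embed(token_ids))[0])
        if tags is not None:                       # training: negative log-likelihood
            return -self.crf(emissions, tags)
        return self.crf.decode(emissions)          # inference: best BIO tag paths
```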
As an alternative implementation manner of the above step S140, when the entity element corresponding to the event category is identified from all the entity elements according to the event prompt vector, the entity element may be determined according to the similarity value between the element vector and the event prompt vector, which may include:
Step S141: and extracting all the entity elements from each text sentence of the text to be processed, and converting all the entity elements into a plurality of element vectors.
The embodiment of step S141 is, for example: assume that the specific content of the text to be processed is: "On October 16, 2016, HG company signed a contract. The contract specifies that HG company mortgages 80,000 of its shares to WL company." All the entity elements extracted from each text sentence of the text to be processed may then include: HG company, WL company, 80,000 shares, and "October 16, 2016". Then, HG company, WL company, 80,000 shares, and "October 16, 2016" are respectively input into the neural network model, thereby obtaining the plurality of element vectors respectively output by the neural network model.
Step S142: for each element vector of the plurality of element vectors, determining whether a similarity value between the element vector and the event prompt vector is greater than a similarity threshold.
Step S143: and if the similarity value between the element vector and the event prompt vector is larger than the similarity threshold value, determining the entity element corresponding to the element vector as the entity element corresponding to the event category.
The embodiment of the above steps S142 to S143 is, for example: because the event prompt vector can represent not only event information but also entity element information within a specific event, assuming that the event prompt vector represents event category information of "mortgage", it is easy to infer that the similarity value between the event prompt vector and the element vector corresponding to the "pledgee" is greater than the similarity threshold, and similarly, the similarity value between the event prompt vector and the element vector corresponding to the "pledged shares" is greater than the similarity threshold. Therefore, by performing joint extraction according to the similarity value between the element vector and the event prompt vector, the accuracy of determining the entity element corresponding to the element vector as the entity element corresponding to the event category can be effectively improved, thereby improving the accuracy of event detection.
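As an illustrative sketch of steps S142 to S143 (the function name, threshold, and data layout are assumptions), the similarity-based argument filtering might be written as:

```python
import torch.nn.functional as F

def select_event_arguments(element_vectors, prompt_vec, sim_threshold=0.5):
    """Keep only the entity elements whose element vector is close enough
    to the event prompt vector, i.e. the arguments of that event category.

    element_vectors: dict mapping an entity mention to its vector
    prompt_vec: the event prompt vector for one event category
    """
    return [mention for mention, vec in element_vectors.items()
            if F.cosine_similarity(vec.unsqueeze(0),
                                   prompt_vec.unsqueeze(0)).item() > sim_threshold]
```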
As an alternative implementation of the above step S150, when performing element role recognition on the entity elements corresponding to the event category, a feed-forward neural network (Feed-Forward Network, FFN) may be used for element role recognition, and this implementation may include:
step S151: and carrying out element role recognition on the entity elements corresponding to the event categories by using the feedforward nerve FFN network to obtain the role categories of the entity elements.
The embodiment of step S151 described above is, for example: element role recognition is performed on the entity elements corresponding to the event categories using a feed-forward neural network (FFN). A feed-forward neural network is an artificial neural network in which the connections between units do not form a cycle, which distinguishes it from a recurrent neural network. It will be appreciated that the role categories of the entity elements may be used to generate an event record table.
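A minimal sketch of such a role classifier, under the assumption of one sigmoid score per role category (the layer sizes and names are illustrative):

```python
import torch.nn as nn

class RoleFFN(nn.Module):
    """Feed-forward role classifier: an entity element vector goes in,
    independent per-role probabilities come out; no recurrent connections."""
    def __init__(self, hidden_dim, num_roles):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_roles),
            nn.Sigmoid(),
        )

    def forward(self, entity_vec):
        return self.net(entity_vec)
```

Each output dimension can then be compared against a cutoff to assign role categories such as "pledgee", from which the event record table is filled in.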
Please refer to fig. 4, which is a schematic flow chart of training a neural network model according to an embodiment of the present application; as an alternative embodiment of the above event extraction method, before the neural network model is used to obtain the event prompt vector, the method may further include training the neural network model as follows:
Step S210: acquiring sample text and a sample label, wherein the sample label comprises: event category labels of sample texts, entity element labels, event entity relation labels and entity role labels, wherein the event category labels are event category sets included in the sample texts, the entity element labels are entity element sets extracted from sample sentences in the sample texts, the event entity relation labels are corresponding relation sets between entity elements and events, and the entity role labels are role categories of the entity elements in the events.
The event category label is the set of event categories that the sample text includes. Assuming that the specific content of the sample text is "HG company signed a contract that specifies that HG company mortgages 80,000 of its shares to WL company", the event category set of the sample text is "mortgage". Similarly, assuming that the specific content of the sample text is "HG company mortgages 80,000 of its shares to WL company, and shareholders reduced their holdings of HG company shares", the event category set of the sample text includes: "mortgage" and "shareholding reduction".
The entity element label is the set of entity elements included in the sample text. Assuming that the specific content of the sample text is "HG company signed a contract under which HG company mortgages 80,000 of its shares to WL company", the entity element labels of the sample text include: HG company, WL company and 80,000 shares.
The event entity relationship label is the set of correspondences between entity elements and events. Assuming that the specific content of the sample text is "HG company mortgages 80,000 of its shares to WL company, and shareholders reduced their holdings of HG company shares", the entity role label of the entity element "WL company" is "pledgee", and the event entity relationship label of the entity element "WL company" indicates that "WL company" belongs to the mortgage event.
The entity role label is the role category of the entity element in the event. Assuming that the specific content of the sample text is "HG company mortgages 80,000 of its shares to WL company, and shareholders reduced their holdings of HG company shares", the entity role label of the entity element "WL company" is "pledgee".
Step S220: and training the neural network model by taking the sample text as training data and taking event category labels, entity element labels, event entity relation labels and entity role labels of the sample text as training labels to obtain the neural network model.
As an alternative embodiment of the above step S220, when training the neural network model, the neural network model may be trained according to four loss values at the same time, and the embodiment may include:
Step S221: and acquiring event prompt vectors by using a neural network model, pooling all token embedded vectors in the sample text to acquire sample expression vectors, calculating text similarity between the event prompt vectors and the sample expression vectors, and determining a first loss value according to event category labels and the text similarity.
It is to be understood that the above embodiment of obtaining the event prompt vector using the neural network model is similar to the embodiment of step S110, and thus will not be described herein.
The embodiment of step S221 described above is, for example: assuming that there are multiple event category labels, the following calculation is performed for each event category label. The sample text is first input into the neural network model, and all token embedded vectors of the sample text output by the neural network model are obtained. These token embedded vectors are then pooled (e.g., max-pooled, mean-pooled, or min-pooled) to obtain a sample representation vector. Then, the text similarity between the event prompt vector and the sample representation vector is calculated using the formula cos_similarity(a, b) = (a · b) / (||a|| × ||b||), where a is the sample representation vector, b is the event prompt vector, and cos_similarity(a, b) denotes the text similarity between the event prompt vector and the sample representation vector.
After obtaining the text similarity between the event prompt vector and the sample representation vector, a first loss value may be calculated from the event category label and the text similarity using loss1 = -y_b × log(sigmoid(cos_similarity(a, b))) - (1 - y_b) × log(1 - sigmoid(cos_similarity(a, b))); where loss1 denotes the first loss value calculated by the loss function, and y_b indicates whether the sample event category is the same as the event category label (0 denotes different, 1 denotes the same). It will be appreciated that, in calculating the first loss value with this loss function, if the sample event category is the same as the event category label, the event prompt vector may be adjusted so that the similarity cos_similarity(a, b) between the event prompt vector and the text representation vector becomes larger; similarly, if the sample event category differs from the event category label, the model may adjust the event prompt vector so that the similarity cos_similarity(a, b) between the event prompt vector and the text representation vector becomes smaller, thereby enabling the model to label the text to be processed with multiple event category labels (i.e., multi-label text classification).
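A sketch of this first loss term in Python (treating the cosine similarity as the logit fed to the sigmoid, as in the formula above; the function name is illustrative):

```python
import torch
import torch.nn.functional as F

def loss1_event_category(sample_repr, prompt_vec, y_b):
    """Binary cross-entropy on sigmoid(cos_similarity(a, b)):
    sample_repr is the pooled sample representation vector a,
    prompt_vec is the event prompt vector b, and y_b is 1.0 when the
    sample carries this event category label, otherwise 0.0."""
    sim = F.cosine_similarity(sample_repr.unsqueeze(0),
                              prompt_vec.unsqueeze(0))
    # BCE-with-logits applies the sigmoid internally
    return F.binary_cross_entropy_with_logits(sim, torch.tensor([y_b]))
```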
Step S222: and extracting a sample entity element from each sample sentence of the sample text by using the neural network model, and calculating a second loss value between the sample entity element of the sample sentence and the entity element label of the sample sentence.
The embodiment of step S222 described above is, for example: the second loss value refers to a loss value between the sample entity elements and the entity element labels, where the entity elements are, for example, HG company, WL company, 80,000 shares, and "October 16, 2016" in the text to be processed. A conditional random field (Conditional Random Field, CRF) can be used to obtain the entity element labels of the sample text, and the neural network can be used to extract sample entity elements from each sample sentence of the sample text. The second loss value between the sample entity elements of a sample sentence and the entity element labels of that sentence may then be calculated, for example, as loss2 = Σ_x p(x) × log(p(x) / q(x)); where loss2 denotes the second loss value between the sample entity elements of the sample sentence and the entity element labels of the sample sentence, p(x) denotes the distribution given by the entity element labels of the sample sentence, and q(x) denotes the distribution of sample entity elements extracted from the sample sentence by the neural network.
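Read as a divergence between the label distribution p(x) and the predicted distribution q(x) (an assumption about the elided formula, which the p(x)/q(x) notation suggests), loss2 might be sketched as:

```python
import torch

def loss2_entity_elements(p_labels, q_pred, eps=1e-12):
    """KL-divergence-style loss: p_labels and q_pred are probability
    tensors of shape (seq_len, num_tags) for one sample sentence; eps
    guards against log(0)."""
    return torch.sum(p_labels * torch.log((p_labels + eps) / (q_pred + eps)))
```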
Step S223: pooling entity element tokens in the sample text to obtain entity representation vectors, calculating vector similarity between event prompt vectors corresponding to the sample text and the entity representation vectors, and calculating the vector similarity according to event entity relation labels to obtain a third loss value.
The embodiment of step S223 described above is, for example: after pooling the entity element tokens in the sample text to obtain the entity representation vectors, the output layer of the neural network may be set to two different fully connected layers: a first fully connected layer and a second fully connected layer. The first fully connected layer computes the projected entity representation vector as ê_j = W_e · e_j + b_e, and the second fully connected layer computes the projected event prompt vector corresponding to the prompt template as ŝ_i = W_s · e_i + b_s; where e_j denotes the entity representation vector obtained after pooling the entity element tokens, W_e and b_e denote the weight and bias parameters corresponding to the entity representation vector, e_i denotes the event prompt vector corresponding to the prompt template, ŝ_i denotes the event prompt vector output after calculation by the second fully connected layer of the neural network, and W_s and b_s denote the weight and bias parameters corresponding to the event prompt vector.
Then, a semantic distance similarity matrix between the entity representation vectors and the event prompt vectors corresponding to the prompt templates may be calculated as A_ij = (ŝ_i · ê_j) / √d_h; where ŝ_i denotes the event prompt vector output by the second fully connected layer, ê_j denotes the entity representation vector calculated by the first fully connected layer, and d_h denotes the embedding dimension of the entity representation vector; because the embedding dimension of the entity representation vector and that of the event prompt vector are the same, d_h can represent either dimension. A_ij denotes the semantic distance similarity matrix between the entity representation vectors and the event prompt vectors.
Finally, a third loss value between any two sample entity elements within the same event may be calculated, for example, as loss3 = -Σ_{i,j} [ y_ij × log(sigmoid(A_ij)) + (1 - y_ij) × log(1 - sigmoid(A_ij)) ]; where loss3 denotes the third loss value between any two sample entity elements within the same event, A_ij denotes the semantic distance similarity matrix between the entity representation vectors and the event prompt vectors, and y_ij indicates whether, in the event entity relationship labels, the event category corresponding to the sample entity element is the same as the event category label of the sample sentence (0 denotes different, 1 denotes the same).
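Under the assumptions made in the reconstruction above (two linear output heads, a dot product scaled by √d_h, and a binary cross-entropy over the resulting matrix), a sketch might be:

```python
import math
import torch.nn as nn
import torch.nn.functional as F

class EventEntityMatcher(nn.Module):
    """Projects entity representation vectors and event prompt vectors
    through two separate fully connected layers, then scores every
    (event, entity) pair with a scaled dot product."""
    def __init__(self, d_h):
        super().__init__()
        self.entity_head = nn.Linear(d_h, d_h)  # W_e, b_e
        self.prompt_head = nn.Linear(d_h, d_h)  # W_s, b_s
        self.scale = math.sqrt(d_h)

    def forward(self, entity_reprs, prompt_vecs):
        e_hat = self.entity_head(entity_reprs)   # (num_entities, d_h)
        s_hat = self.prompt_head(prompt_vecs)    # (num_events, d_h)
        return s_hat @ e_hat.T / self.scale      # similarity matrix A

def loss3_event_entity(sim_matrix, relation_labels):
    """BCE over the similarity matrix against the 0/1 (float) event
    entity relationship labels of the same shape."""
    return F.binary_cross_entropy_with_logits(sim_matrix, relation_labels)
```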
Step S224: and extracting the sample role category of the sample entity element by using the neural network model, and calculating a fourth loss value between the sample role category and the entity role label of the sample text.
The embodiment of step S224 described above is, for example: the sample role categories of the sample entity elements are extracted using a feed-forward neural network (FFN) in the neural network, whose prediction process can be formulated as r̂_k = sigmoid(FFN(ε_k)); where ε_k denotes a sample entity element, FFN denotes the processing of the feed-forward neural network, sigmoid denotes the calculation of the sigmoid function (also called the Logistic function), and r̂_k denotes the sample role category of the sample entity element.
After obtaining the sample role category of the sample entity element, a fourth loss value between the sample role category and the entity role label of the sample text may be calculated as loss4 = -Σ_k [ y_k × log(r̂_k) + (1 - y_k) × log(1 - r̂_k) ]; where loss4 denotes the fourth loss value between the sample role category and the entity role label of the sample text, r̂_k denotes the sample role category of the sample entity element, and y_k indicates whether the sample role category is the same as the role category corresponding to the entity role label of the sample text (0 denotes different, 1 denotes the same).
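A sketch of this fourth loss term, assuming the FFN already emits sigmoid probabilities as in the prediction formula above (the function name is illustrative):

```python
import torch.nn.functional as F

def loss4_role(role_probs, role_labels):
    """Binary cross-entropy between the FFN's sigmoid role scores for the
    sample entity elements and the 0/1 (float) entity role labels
    (same shape)."""
    return F.binary_cross_entropy(role_probs, role_labels)
```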
Step S225: training the neural network model according to the first loss value, the second loss value, the third loss value and/or the fourth loss value to obtain the neural network model.
The embodiment of step S225 described above is, for example: it will be appreciated that the first, second, third and/or fourth loss values may be combined using the formula Loss = a × loss1 + b × loss2 + c × loss3 + d × loss4 to obtain a total loss value; where loss1 denotes the first loss value between the event category of the sample event and the event category label of the sample text, loss2 denotes the second loss value between the sample entity elements of the sample sentence and the entity element labels of the sample sentence, loss3 denotes the third loss value between any two sample entity elements within the same event, loss4 denotes the fourth loss value between the sample role category and the entity role label of the sample text, and a, b, c and d are adjustable hyper-parameters.
After the total loss value is obtained, the network weight parameters of the neural network model are updated according to the total loss value until the accuracy of the neural network model no longer increases or the number of iterations (epochs) exceeds a preset threshold, thereby obtaining the trained neural network model. The preset threshold may be set according to the specific situation, for example, to 100 or 1000.
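Putting the pieces together, a sketch of the weighted total loss and the epoch-capped update loop (the accuracy-based stopping criterion is omitted for brevity; all names and the learning rate are illustrative):

```python
import torch

def total_loss(l1, l2, l3, l4, a=1.0, b=1.0, c=1.0, d=1.0):
    """Weighted sum of the four loss values; a, b, c, d are the
    adjustable hyper-parameters named above."""
    return a * l1 + b * l2 + c * l3 + d * l4

def train(model, batches, compute_losses, max_epochs=100, lr=1e-5):
    """Update the network weight parameters from the total loss until
    the preset epoch cap is reached."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(max_epochs):
        for batch in batches:
            l1, l2, l3, l4 = compute_losses(model, batch)
            loss = total_loss(l1, l2, l3, l4)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```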
As an alternative embodiment of the event extraction method, the neural network model may include: a BERT model, a RoBERTa model, a UniLM model, an XLNet model, an RNN model, a CNN model, or a model based on the Transformer structure.
Please refer to fig. 5, which illustrates a schematic structure diagram of an event extraction device according to an embodiment of the present application; the embodiment of the application provides an event extraction device 300, which comprises:
a prompt vector extraction module 310, configured to obtain event prompt vectors using a neural network model, where the event prompt vectors include multiple sets of token embedding vectors, each set of token embedding vectors characterizes an event category, and each set of token embedding vectors includes multiple token embedding vectors.
The event classification obtaining module 320 is configured to obtain a text to be processed, and classify the text to be processed according to the event prompt vector to obtain an event classification.
The entity element extraction module 330 is configured to extract all entity elements from the text to be processed.
The entity element identification module 340 is configured to identify, from all entity elements, entity elements corresponding to the event category according to the event prompt vector.
The role category obtaining module 350 is configured to identify an element role for an entity element corresponding to the event category, obtain a role category of the entity element, and use the role category of the entity element to generate an event record table.
Optionally, in an embodiment of the present application, the prompt vector extraction module includes:
the construction data acquisition sub-module is used for acquiring a pre-constructed category label description sentence, a category label interpretation definition sentence, an element role name and an event key word;
the expression vector obtaining sub-module is used for inputting the category label description statement, the category label interpretation definition statement, the element role name and the event key word into the neural network model to obtain a sentence expression vector output by the neural network model;
a first vector determination submodule for determining an event hint vector according to a plurality of token embedded vectors in the sentence representation vector.
Optionally, in an embodiment of the present application, the first vector determination submodule includes:
a first vector determination unit, configured to screen out a category token embedded vector from a plurality of token embedded vectors in the sentence representation vector, and determine the category token embedded vector as an event prompt vector;
or, the second vector determining unit is used for carrying out maximum pooling processing on a plurality of token embedded vectors in the sentence representation vector to obtain an event prompt vector;
or, the third vector determining unit is used for carrying out mean value pooling processing on a plurality of token embedded vectors in the sentence representation vector to obtain an event prompt vector;
Or the fourth vector determining unit is used for carrying out minimum pooling processing on a plurality of token embedded vectors in the sentence representation vector to obtain the event prompt vector.
Optionally, in an embodiment of the present application, the prompt vector extraction module includes:
the event matrix acquisition sub-module is used for acquiring an event matrix, wherein the event matrix is a matrix structure constructed according to the total number of event categories and a plurality of token embedded vectors of each event category, and the event matrix is obtained by learning the matrix structure by using a neural network model;
and the second vector determination sub-module is used for determining an event prompt vector according to a plurality of token embedded vectors in the event matrix for each event category.
Optionally, in an embodiment of the present application, the event matrix includes: a plurality of token embedded vectors; a second vector determination submodule comprising:
the fifth vector determining unit is used for carrying out maximum pooling processing on a plurality of token embedded vectors in the event matrix to obtain event prompt vectors;
or, the sixth vector determining unit is used for carrying out mean value pooling processing on the token embedded vectors in the event matrix to obtain an event prompt vector;
or the seventh vector determining unit is used for carrying out minimum pooling processing on the token embedded vectors in the event matrix to obtain the event prompt vector.
Optionally, in an embodiment of the present application, the event classification obtaining module includes:
the event category obtaining sub-module is used for obtaining text expression vectors of the text to be processed and event categories of a plurality of groups of token embedding vectors in the event prompt vectors.
And the text vector judging sub-module is used for judging whether the similarity value between the text expression vector and each group of token embedded vectors is larger than a preset threshold value or not according to each group of token embedded vectors in the plurality of groups of token embedded vectors.
And the event category determination submodule is used for determining the event category of the text to be processed as the event category of the embedded vector of the group of tokens if the text representation vector and the event prompt vector are larger than a preset threshold value.
Optionally, in an embodiment of the present application, the entity element identification module includes:
and the element extraction and conversion sub-module is used for extracting all entity elements of each text sentence from the text to be processed and converting all entity elements into a plurality of element vectors.
And the similarity value judging sub-module is used for judging whether the similarity value between each element vector in the plurality of element vectors and a group of token embedded vectors in the event prompt vector is larger than a similarity threshold value or not.
And the entity element determining sub-module is used for determining the entity element corresponding to the element vector as the entity element corresponding to the event category of the token embedded vector if the similarity value between the element vector and the event prompt vector is larger than the similarity threshold value.
Optionally, in an embodiment of the present application, the role category obtaining module includes:
and the element role identification sub-module is used for carrying out element role identification on the entity elements corresponding to the event categories by using the feedforward neural FFN network.
Optionally, in an embodiment of the present application, the event extraction device further includes:
the text label acquisition module is used for acquiring sample text and sample labels, and the sample labels comprise: event category labels of sample texts, entity element labels, event entity relation labels and entity role labels, wherein the event category labels are event category sets included in the sample texts, the entity element labels are entity element sets extracted from sample sentences of the sample texts, the event entity relation labels are corresponding relation sets between entity elements and events, and the entity role labels are role categories of the entity elements in the events.
The neural network training module is used for training the neural network model by taking the sample text as training data and taking the event category label, the entity element label, the event entity relation label and the entity role label of the sample text as training labels to obtain the neural network model.
Optionally, in an embodiment of the present application, the neural network training module includes:
the first loss calculation sub-module is used for obtaining event prompt vectors by using the neural network model, pooling all token embedded vectors in the sample text to obtain sample expression vectors, calculating text similarity between the event prompt vectors and the sample expression vectors, and determining a first loss value according to event category labels and the text similarity.
And the second loss calculation sub-module is used for extracting sample entity elements from each sample sentence of the sample text by using the neural network model, and calculating a second loss value between the sample entity elements of the sample sentence and the entity element labels of the sample sentence.
And the third loss calculation sub-module is used for pooling entity element tokens in the sample text to obtain entity representation vectors, calculating the vector similarity between the event prompt vectors corresponding to the sample text and the entity representation vectors, and determining a third loss value according to the event entity relation labels and the vector similarity.
And the fourth loss calculation sub-module is used for extracting the sample role category of the sample entity element by using the neural network model and calculating a fourth loss value between the sample role category and the entity role label of the sample text.
The neural network training sub-module is used for training the neural network model according to the first loss value, the second loss value, the third loss value and/or the fourth loss value.
Optionally, in an embodiment of the present application, the neural network model includes: a BERT model, a RoBERTa model, a UniLM model, an XLNet model, an RNN model, a CNN model, or a model based on the Transformer structure.
It should be understood that, corresponding to the above event extraction method embodiment, the apparatus can perform the steps involved in the above method embodiment; for the specific functions of the apparatus, reference may be made to the above description, and detailed descriptions are omitted here as appropriate to avoid repetition. The device includes at least one software functional module that can be stored in memory in the form of software or firmware, or built into the operating system (OS) of the device.
Please refer to fig. 6, which illustrates a schematic structural diagram of an electronic device according to an embodiment of the present application. An electronic device 400 provided in an embodiment of the present application includes: a processor 410 and a memory 420, the memory 420 storing machine-readable instructions executable by the processor 410, which when executed by the processor 410 perform the method as described above.
The embodiment of the present application also provides a computer readable storage medium 430, on which computer readable storage medium 430 a computer program is stored which, when executed by the processor 410, performs a method as above.
The computer-readable storage medium 430 may be implemented by any type or combination of volatile or nonvolatile Memory devices, such as static random access Memory (Static Random Access Memory, SRAM for short), electrically erasable programmable Read-Only Memory (Electrically Erasable Programmable Read-Only Memory, EEPROM for short), erasable programmable Read-Only Memory (Erasable Programmable Read Only Memory, EPROM for short), programmable Read-Only Memory (Programmable Read-Only Memory, PROM for short), read-Only Memory (ROM for short), magnetic Memory, flash Memory, magnetic disk, or optical disk.
It should be noted that, in the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described as different from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other. For the apparatus class embodiments, the description is relatively simple as it is substantially similar to the method embodiments, and reference is made to the description of the method embodiments for relevant points.
In the embodiments of the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative, for example, of the flowcharts and block diagrams in the figures that illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
In addition, the functional modules of the embodiments of the present application may be integrated together to form a single part, or the modules may exist separately, or two or more modules may be integrated to form a single part. Furthermore, in the description herein, the descriptions of the terms "one embodiment," "some embodiments," "examples," "specific examples," "some examples," and the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the embodiments of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
The foregoing description is merely an optional implementation of the embodiment of the present application, but the scope of the embodiment of the present application is not limited thereto, and any person skilled in the art may easily think about changes or substitutions within the technical scope of the embodiment of the present application, and the changes or substitutions are covered by the scope of the embodiment of the present application.

Claims (10)

1. An event extraction method, comprising:
acquiring event prompt vectors by using a neural network model, wherein the event prompt vectors comprise a plurality of groups of token embedded vectors, each group of token embedded vectors represents an event category, and each group of token embedded vectors comprises a plurality of token embedded vectors;
acquiring a text to be processed, and carrying out event classification on the text to be processed according to the event prompt vector to obtain an event category;
extracting all entity elements from the text to be processed;
identifying entity elements corresponding to the event category from all the entity elements according to the event prompt vector;
and carrying out element role recognition on the entity element corresponding to the event category to obtain the role category of the entity element, wherein the role category of the entity element is used for generating an event record table.
2. The method of claim 1, wherein the obtaining an event hint vector using a neural network model comprises:
acquiring a pre-constructed category label description statement, a category label interpretation definition statement, an element role name and an event key word;
inputting the category label description statement, the category label interpretation definition statement, the element role name and the event key word into the neural network model to obtain a sentence representation vector output by the neural network model;
the event hint vector is determined from a plurality of token embedded vectors in the sentence representation vector.
3. The method of claim 2, wherein the determining the event hint vector from a plurality of token embedded vectors in the sentence representation vector comprises:
screening out category token embedded vectors from a plurality of token embedded vectors in the sentence representation vector, and determining the category token embedded vectors as the event prompt vectors;
or, carrying out maximum pooling treatment on a plurality of token embedded vectors in the sentence representation vector to obtain the event prompt vector;
or, carrying out average pooling processing on a plurality of token embedded vectors in the sentence representation vector to obtain the event prompt vector;
Or, performing a minimum pooling process on a plurality of token embedded vectors in the sentence representation vector to obtain the event prompt vector.
4. The method of claim 1, wherein the obtaining an event hint vector using a neural network model comprises:
acquiring an event matrix, wherein the event matrix is a matrix structure constructed according to the total number of event categories and a plurality of token embedded vectors of each event category, and learning the matrix structure by using the neural network model;
for each of the event categories, determining the event hint vector from a plurality of token embedded vectors in the event matrix.
5. The method of claim 4, wherein the event matrix comprises: a plurality of token embedded vectors; the determining the event hint vector from a plurality of token embedded vectors in the event matrix includes:
carrying out maximum pooling treatment on a plurality of token embedded vectors in the event matrix to obtain the event prompt vector;
or, carrying out mean value pooling treatment on a plurality of token embedded vectors in the event matrix to obtain the event prompt vector;
Or, performing minimum pooling processing on a plurality of token embedded vectors in the event matrix to obtain the event prompt vector.
6. The method of claim 1, wherein the event classifying the text to be processed according to the event prompt vector comprises:
acquiring a text representation vector of the text to be processed and event categories of a plurality of groups of token embedding vectors in the event prompt vector;
for each group of token embedded vectors in the plurality of groups of token embedded vectors, judging whether the similarity value between the text expression vector and the group of token embedded vectors is larger than a preset threshold value;
if yes, determining the event category of the text to be processed as the event category of the set of token embedded vectors.
7. The method according to claim 1, wherein the identifying, from the total entity elements according to the event hint vector, the entity element corresponding to the event category includes:
extracting all entity elements from each text sentence of the text to be processed, and converting all entity elements into a plurality of element vectors;
for each element vector of the plurality of element vectors, determining whether a similarity value between the element vector and a set of token embedded vectors of the event hint vector is greater than a similarity threshold;
If yes, determining the entity element corresponding to the element vector as the entity element corresponding to the event category of the token embedded vector.
8. The method of any of claims 1-7, further comprising, prior to said obtaining an event prompt vector using a neural network model:
obtaining sample text and a sample tag, the sample tag comprising: the event type label is an event type set included in the sample text, the entity element label is an entity element set extracted from a sample sentence of the sample text, the event entity relationship label is a corresponding relationship set between the entity element and an event, and the entity role label is a role type of the entity element in the event;
and training the neural network model by taking the sample text as training data and taking an event category label, an entity element label, an event entity relation label and an entity role label of the sample text as training labels to obtain the neural network model.
9. The method of claim 8, wherein training the neural network model comprises:
Acquiring the event prompt vector by using the neural network model, pooling all token embedded vectors in the sample text to obtain a sample representation vector, calculating text similarity between the event prompt vector and the sample representation vector, and determining a first loss value according to the event category label and the text similarity;
extracting sample entity elements from each sample sentence of the sample text by using the neural network model, and calculating a second loss value between the sample entity elements of the sample sentence and entity element labels of the sample sentence;
pooling entity element tokens in the sample text to obtain entity representation vectors, calculating vector similarity between event prompt vectors corresponding to the sample text and the entity representation vectors, and determining a third loss value according to the event entity relationship labels and the vector similarity;
extracting a sample role category of the sample entity element by using the neural network model, and calculating a fourth loss value between the sample role category and an entity role label of the sample text;
training the neural network model according to the first loss value, the second loss value, the third loss value and/or the fourth loss value.
10. An event extraction device, comprising:
the prompt vector extraction module is used for acquiring event prompt vectors by using the neural network model;
the event classification obtaining module is used for obtaining a text to be processed, and carrying out event classification on the text to be processed according to the event prompt vector to obtain an event class;
the entity element extraction module is used for extracting all entity elements from the text to be processed;
the entity element identification module is used for identifying entity elements corresponding to the event category from all the entity elements according to the event prompt vector;
the role category obtaining module is used for carrying out element role recognition on the entity elements corresponding to the event categories to obtain the role categories of the entity elements, wherein the role categories of the entity elements are used for generating an event record table.