CN113761122A - Event extraction method, related device, equipment and storage medium


Info

Publication number
CN113761122A
CN113761122A (application CN202110546916.7A)
Authority
CN
China
Prior art keywords
semantic
graph
event
nodes
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110546916.7A
Other languages
Chinese (zh)
Inventor
李涓子 (Juanzi Li)
王子奇 (Ziqi Wang)
王晓智 (Xiaozhi Wang)
韩旭 (Xu Han)
林衍凯 (Yankai Lin)
侯磊 (Lei Hou)
刘知远 (Zhiyuan Liu)
李鹏 (Peng Li)
周杰 (Jie Zhou)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tsinghua University
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University and Tencent Technology (Shenzhen) Co., Ltd.
Priority to CN202110546916.7A
Publication of CN113761122A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/33 - Querying
    • G06F 16/3331 - Query processing
    • G06F 16/334 - Query execution
    • G06F 16/3344 - Query execution using natural language analysis
    • G06F 16/335 - Filtering based on additional data, e.g. user or group profiles
    • G06F 16/35 - Clustering; Classification
    • G06F 16/36 - Creation of semantic tools, e.g. ontology or thesauri
    • G06F 40/00 - Handling natural language data
    • G06F 40/30 - Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the application discloses an event extraction method, a related apparatus, a device and a storage medium, which convert sentence-level natural language into nodes and edges and then convert the nodes and edges into semantic features for event extraction, ensuring the accuracy of the acquired events. The method in the embodiment of the application comprises the following steps: acquiring a text to be processed; generating an abstract semantic representation according to the text to be processed, wherein the abstract semantic representation comprises nodes in one-to-one correspondence with the words of the text and edges used for connecting the nodes; performing semantic coding on the abstract semantic representation and the text to be processed to obtain a semantic embedded vector, wherein the semantic embedded vector is used for representing the semantic features between each word and an event; performing graph coding on the abstract semantic representation to obtain a graph embedding vector, wherein the graph embedding vector is used for representing the structural features between nodes connected by edges; splicing the semantic embedded vector and the graph embedded vector to obtain a spliced feature vector; and identifying the spliced feature vector and outputting a target event.

Description

Event extraction method, related device, equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of internet, in particular to an event extraction method, a related device, equipment and a storage medium.
Background
Event extraction extracts events of interest to a user from unstructured information and presents them to the user in a structured manner, generally by first performing event detection on the information and then performing event role extraction on the detected events.
The traditional event extraction method mainly adopts supervised learning: a standard corpus is generally obtained by manually labeling data set labels in advance, the events in the data set are defined in advance, an advanced neural network is trained with the labeled corpus and the defined event schema, and events are then extracted with the trained network.
Although the traditional event extraction method does not depend on the content and format of the corpus, constructing an event extraction data set is difficult and requires a large-scale standard corpus; otherwise a serious data sparsity problem arises, which limits the training of the neural network and results in a poor event extraction effect.
Disclosure of Invention
The embodiment of the application provides an event extraction method, which converts sentence-level natural language into nodes and edges that are convenient to identify, and then converts the nodes and edges into semantic embedded vectors that accurately reflect the semantic relationships between nodes and graph embedded vectors that accurately reflect the semantic relationships between nodes connected by edges. The event semantic definition framework is thereby expanded, the event semantic relationships have higher accuracy, and the precision of obtaining target events is ensured.
In view of the above, an aspect of the present application provides an event extraction method, including:
acquiring a text to be processed, wherein the text to be processed comprises N words, and N is an integer greater than 1;
generating an abstract semantic representation according to the text to be processed, wherein the abstract semantic representation comprises nodes in one-to-one correspondence with the words and edges used for connecting the nodes;
carrying out semantic coding processing on the abstract semantic representation and the text to be processed to obtain a semantic embedded vector, wherein the semantic embedded vector is used for representing semantic features between each word and each event;
carrying out graph coding processing on the abstract semantic representation to obtain a graph embedding vector, wherein the graph embedding vector is used for representing structural features between nodes connected through edges;
splicing the semantic embedded vector and the graph embedded vector to obtain a spliced feature vector;
and identifying the spliced feature vector and outputting a target event, wherein the target event comprises a trigger word and role words extracted from the N words, the trigger word is used for indicating an event occurring in the text to be processed, and the role words are used for indicating the roles that the entities in the text to be processed play in the event.
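As an illustration only, the following minimal Python sketch outlines the flow of the above aspect; all callables (amr_parse, encode_seq, encode_graph, classify) are assumed stand-ins for exposition and are not part of the claimed implementation.
    import numpy as np

    def extract_event(text, amr_parse, encode_seq, encode_graph, classify):
        """Hypothetical stand-in flow for the claimed method steps."""
        graph = amr_parse(text)          # generate the abstract semantic representation
        h = encode_seq(text, graph)      # semantic coding -> semantic embedded vector
        g = encode_graph(graph)          # graph coding -> graph embedding vector
        f = np.concatenate([h, g])       # splice into the spliced feature vector
        return classify(f)               # identify and output the target event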
Another aspect of the present application provides an event extraction apparatus, including:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a text to be processed, the text to be processed comprises N words, and N is an integer greater than 1;
the generating unit is used for generating abstract semantic representations according to the texts to be processed, wherein the abstract semantic representations comprise nodes which are in one-to-one correspondence with the words and edges which are used for connecting the nodes;
the processing unit is used for carrying out semantic coding processing on the abstract semantic representation and the text to be processed to obtain a semantic embedded vector, wherein the semantic embedded vector is used for representing semantic features between each word and each event;
the processing unit is also used for carrying out graph coding processing on the abstract semantic representation to obtain a graph embedding vector, wherein the graph embedding vector is used for representing structural features between nodes connected through edges;
the processing unit is also used for splicing the semantic embedded vector and the graph embedded vector to obtain a spliced feature vector;
and the identification unit is used for identifying the spliced feature vectors and outputting a target event, wherein the target event comprises a trigger word and a role word extracted from the N words, the trigger word is used for indicating an event occurring in the text to be processed, and the role word is used for indicating roles of all entities in the text to be processed in the event.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the processing unit is also used for carrying out node coding processing on the text to be processed to obtain a node coding vector, and the node coding vector is used for initializing the semantic features between each word and an event;
and the processing unit is also used for carrying out graph coding processing on the node coding vector and the abstract semantic representation through a graph coding model to obtain a graph embedding vector.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the determining unit is used for determining the maximum pooling feature vector of the trigger word and the maximum pooling feature vector of the role word corresponding to the semantic embedded vector according to a dynamic maximum pooling algorithm;
the processing unit is specifically used for splicing the maximum pooling feature vector of the trigger word and the maximum pooling feature vector of the role word with the spliced feature vector to obtain a feature vector to be identified;
and the output unit is used for carrying out classification and identification on the characteristic vectors to be identified to obtain the target event.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the processing unit is specifically used for carrying out spectral clustering on the spliced feature vectors to obtain a node clustering graph, wherein the node clustering graph comprises clustering nodes and clustering edges connecting the clustering nodes;
the determining unit is also used for determining the edge weight values between the clustering nodes according to the distance of the clustering edges;
the processing unit is specifically used for carrying out graph cutting processing on the node cluster graph to obtain K cluster subgraphs, wherein K is an integer larger than 1;
and the output unit is also used for outputting the target event when the edge weight value of each clustering subgraph accords with the preset weight value.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
and the processing unit is also used for coding the abstract semantic representation according to the sequence coding model to obtain a semantic embedded vector, and the sequence coding model is used for carrying out time-sequence coding on the nodes connected through the edges.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the obtaining unit is further used for obtaining a corpus to be processed in a database, wherein the corpus to be processed comprises M sentences, and M is an integer greater than or equal to 1;
and the generating unit is also used for generating an abstract semantic representation set from the corpus to be processed, wherein the abstract semantic representation set comprises the node sets corresponding to the sentences and the edge sets formed by the edges connecting the nodes.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the generating unit is also used for generating a contrastive learning training data set according to the abstract semantic representation set;
and the training unit is used for pre-training the model according to the contrastive learning training data set.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the building unit is used for building semantic positive examples according to the edges, the nodes, the edge set and the node set, wherein a semantic positive example is used for expressing the semantic affinity between a trigger word and a role;
the building unit is also used for performing a node replacement operation on the semantic positive examples to obtain semantic negative examples, wherein a semantic negative example represents a semantically distant relationship between a trigger word and a role;
and the building unit is also used for taking the semantic positive examples and the semantic negative examples as a contrastive learning semantic training data set.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the training unit is specifically used for training the basic sequence coding model by back propagation according to the contrastive learning semantic training data set and the loss function to obtain the sequence coding model.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the constructing unit is also used for performing random graph sampling twice on each abstract semantic representation in the abstract semantic representation set to obtain 2M random sampling subgraphs, wherein each random sampling subgraph comprises a target node and sub-nodes, and each abstract semantic representation corresponds to two random sampling subgraphs;
the construction unit is also used for taking the 2M random sampling subgraphs as graph positive examples, wherein a graph positive example is used for expressing the semantic affinity of the sub-nodes relative to the target node;
the construction unit is also used for randomly combining the random sampling subgraphs to obtain X random combination subgraphs, wherein each random combination subgraph comprises two random sampling subgraphs, the two random sampling subgraphs in a random combination subgraph correspond to different abstract semantic representations, and X is an integer not equal to 2M;
the construction unit is further used for taking the X random combination subgraphs as graph negative examples, wherein a graph negative example is used for expressing the semantically distant relationship of the sub-nodes relative to the target node;
and the construction unit is also used for taking the graph positive examples and the graph negative examples as a contrastive learning graph training data set.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the determining unit is also used for taking any node in the abstract semantic representation as a target node and taking the nodes other than the target node as sub-nodes;
the processing unit is also used for performing a random walk from the target node to the sub-nodes to obtain a random node set, wherein the random node set comprises the target node, the sub-nodes and the edges traversed in moving from the target node to the sub-nodes;
the processing unit is also used for drawing a node subgraph according to the random node set;
and the processing unit is also used for numbering the target node and the sub-nodes in the node subgraph to obtain a random subgraph.
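As an illustration only, a minimal sketch of such random-walk subgraph sampling; the triple-based graph representation and the step count are assumptions.
    import random

    def random_walk_subgraph(edges, target, steps=3, seed=0):
        """Walk from the target node toward sub-nodes, collecting the traversed
        nodes and edges, then number the nodes to form the random subgraph."""
        rng = random.Random(seed)
        adj = {}
        for u, r, v in edges:                  # edges as (node, relation, node) triples
            adj.setdefault(u, []).append((r, v))
        visited, walk_edges, current = [target], [], target
        for _ in range(steps):
            if current not in adj:
                break
            r, nxt = rng.choice(adj[current])  # random move toward a sub-node
            walk_edges.append((current, r, nxt))
            if nxt not in visited:
                visited.append(nxt)
            current = nxt
        numbering = {n: i for i, n in enumerate(visited)}  # number target and sub-nodes
        return visited, walk_edges, numbering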
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the training unit is specifically used for training the basic graph coding model by back propagation according to the contrastive learning graph training data set and the loss function to obtain the graph coding model.
Another aspect of the present application provides a computer-readable storage medium having stored therein instructions, which when executed on a computer, cause the computer to perform the method of the above-described aspects.
In another aspect of the application, a computer program product or computer program is provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the network device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the network device to perform the method provided by the above aspects.
According to the technical scheme, the embodiment of the application has the following advantages:
firstly, an abstract semantic representation is generated according to the text to be processed, converting words into nodes and edges that are convenient to recognize. Semantic coding is then performed on the abstract semantic representation and the text to be processed to obtain a semantic embedded vector that reflects, through the semantic features between nodes, the semantic features between each word and an event; graph coding is performed on the abstract semantic representation to obtain a graph embedding vector that accurately reflects the structural relationships between nodes connected by edges; event recognition is then performed on the spliced feature vector obtained by splicing the semantic embedded vector and the graph embedded vector to obtain a target event. In this way, words are converted into nodes and edges that are convenient to recognize, which are in turn converted into semantic embedded vectors reflecting the semantic features between each word and an event and graph embedded vectors reflecting the structural features between nodes connected by edges. The event semantic definition framework is thereby expanded, the learned semantic and structural features between words and events have higher accuracy, and the precision of acquiring target events is improved.
Drawings
FIG. 1 is a schematic diagram of an architecture of data processing in an embodiment of the present application;
FIG. 2 is a schematic diagram of an embodiment of an event extraction method in the embodiment of the present application;
FIG. 3 is a schematic diagram of another embodiment of an event extraction method in the embodiment of the present application;
FIG. 4 is a schematic diagram of another embodiment of an event extraction method in the embodiment of the present application;
FIG. 5 is a schematic diagram of another embodiment of an event extraction method in the embodiment of the present application;
FIG. 6 is a schematic diagram of another embodiment of an event extraction method in the embodiment of the present application;
FIG. 7 is a schematic diagram of another embodiment of an event extraction method in the embodiment of the present application;
FIG. 8 is a schematic diagram of another embodiment of an event extraction method in the embodiment of the present application;
FIG. 9 is a schematic diagram of another embodiment of an event extraction method in the embodiment of the present application;
FIG. 10 is a schematic diagram of another embodiment of an event extraction method in the embodiment of the present application;
FIG. 11 is a diagram illustrating an abstract semantic representation of an event extraction method according to an embodiment of the present application;
FIG. 12 is a diagram illustrating target events of an event extraction method according to an embodiment of the present application;
FIG. 13 is a schematic diagram illustrating an effect evaluation of the event extraction method according to the embodiment of the present application;
FIG. 14 is a schematic diagram illustrating another effect evaluation of the event extraction method according to the embodiment of the present application;
FIG. 15 is a schematic diagram of an embodiment of an event extraction device in the embodiment of the present application;
FIG. 16 is a schematic diagram of an embodiment of a computer device in the embodiment of the present application.
Detailed Description
The embodiment of the application provides an event extraction method, which converts words into nodes and edges that are convenient to identify, and then converts them into semantic embedded vectors that accurately reflect the semantic relationships between nodes and graph embedded vectors that accurately reflect the semantic relationships between nodes connected by edges, so that the event semantic definition framework is expanded, the event semantic relationships have higher accuracy, and the precision of obtaining a target event is ensured.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that the event extraction method provided by the present application may be applied to any scenario in which information is obtained from natural language: for example, event extraction is performed on news information to obtain key news information; as another example, a knowledge graph is constructed through event extraction; as another example, event information in a social network is obtained through event extraction. To extract events in the above scenarios, a conventional event extraction method mainly labels data set tags manually to obtain a standard corpus in advance, defines a schema for the events in the data set, trains an advanced neural network with these, and then extracts events with the trained network. This not only makes the event extraction data set difficult to construct but also requires a large-scale standard corpus, so the training of the neural network is limited and the event extraction effect is poor.
In order to solve the above problems, the present application provides an event extraction method, which is applied to the data processing system shown in fig. 1. Referring to fig. 1, fig. 1 is a schematic diagram of the architecture of the data processing system in an embodiment of the present application. As shown in fig. 1, an abstract semantic representation is first generated according to the text to be processed, converting words into nodes and edges that are convenient to identify. Semantic encoding is then performed on the abstract semantic representation to obtain a semantic embedded vector that accurately reflects the semantic relationships between nodes, and graph encoding is performed on the abstract semantic representation to obtain a graph embedded vector that accurately reflects the semantic relationships between nodes connected by edges. Event identification is then performed on the spliced feature vector obtained by splicing the semantic embedded vector and the graph embedded vector to obtain a target event. In this way, words are converted into nodes and edges that are convenient to recognize and then into the two embedded vectors, so that the event semantic definition framework is expanded, the event semantic relationships have high accuracy, and the precision of obtaining the target event is guaranteed.
In order to solve the above problems, the present application proposes an event extraction method, which is generally performed by a server or a terminal device; accordingly, the event extraction apparatus applying it is generally provided in the server or the terminal device.
Referring to fig. 2, an example of an event extraction method in the present application includes:
in step S101, a to-be-processed text is obtained, where the to-be-processed text includes N words, and N is an integer greater than 1.
In this embodiment, the text to be processed is a natural language instance from which events of interest to the information acquirer are subsequently extracted through event extraction and presented in a structured manner. It may be embodied as a news item, a billboard, or a selected sentence of an article, and the like, without specific limitation here; for example, the news item "Little Sunshine Information reports a fire occurring in the community today". The news includes a plurality of words, such as "today", "in the community" and "fire", which can be used to represent events in the news information and thereby help the acquirer of the news quickly and accurately obtain its key information through the acquired events.
An event is a representation of information that can express the objective fact that a specific person or object interacts at a specific time and place; it is generally sentence-level, and is composed of a trigger word, an event type and roles.
The trigger word is the core word indicating the occurrence of an event, mostly a verb or a noun, though it may also be a phrase; for example, in the above news item "Little Sunshine Information reports a fire occurring in the community today", "fire" is the trigger word. The event type refers to the 8 event types and 33 subtypes defined in the ACE 2005 data set; most event extraction adopts the 33 subtypes, and, for example, the event type corresponding to the trigger word "fire" is "disaster". The roles comprise arguments and argument roles. An argument is a participant of an event and mainly comprises entities, values and times; for example, "today" and "in the community" in the news item can be understood as arguments, while a value is a non-entity event participant, such as a job position. An argument role refers to the role an event participant plays in the event; 35 role types, e.g., attacker and victim, are defined in the ACE 2005 data set, and, for example, the role of "today" in the above news item is "time" and the role of "in the community" is "place".
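As an illustration only, a structured target event for the example sentence might be represented as follows; the field names are assumptions for exposition.
    # Hypothetical structured representation of the extracted target event.
    event = {
        "trigger": "fire",           # core word indicating that the event occurred
        "event_type": "disaster",    # one of the predefined event types
        "arguments": [
            {"text": "today",            "role": "time"},
            {"text": "in the community", "role": "place"},
        ],
    }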
Specifically, when an information acquirer wants to quickly obtain events of interest from some information, that information can be taken as the text to be processed and acquired, so that the events of interest can subsequently be extracted from it through event extraction and presented to the acquirer in a structured manner, ensuring the effectiveness and integrity of the acquired events.
In step S102, an abstract semantic representation is generated according to the text to be processed, where the abstract semantic representation includes nodes corresponding to the words one to one, and edges used for connecting the nodes.
In this embodiment, after the text to be processed is obtained, the abstract semantic representation can be generated from it, so that the unlabeled text is converted into a semantic abstract representation that clearly reflects the semantic relations among the words of the text. On one hand, this can be understood as automatically labeling the unlabeled text: no manual labeling is needed, which reduces the training cost, ensures a sufficient amount of training data, guarantees the model training precision to a certain extent, and thus ensures the precision of the obtained events. On the other hand, because the relationship between edges and nodes in the abstract semantic representation structure is similar to the relationship between trigger words and roles in an event, generating the abstract semantic representation from the text to be processed allows the semantic features between each word and an event to be represented by extracting the edges and nodes, so that the trigger words and role words of the event corresponding to the text can be accurately acquired through those semantic features, improving the accuracy of event acquisition.
Further, since the text to be processed is a natural language instance, the semantic processing of natural language sentences is mainly performed through Natural Language Processing (NLP), an important direction in the fields of computer science and artificial intelligence. NLP studies theories and methods that enable efficient communication between humans and computers using natural language; it is a science integrating linguistics, computer science and mathematics, and research in this field involves natural language, i.e., the language people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robotic question answering, knowledge graphs, and the like. Therefore, this embodiment may generate an abstract semantic representation (Abstract Meaning Representation, AMR) from the text to be processed to obtain the semantic features between words and events; specifically, this may be done by an abstract semantic representation parser such as CAMR or JAMR, which is not specifically limited here.
The abstract semantic representation is generated by converting a natural language sentence into nodes and edges joining the nodes, where each node is a word or phrase in the sentence and different edges represent the relationships between nodes. For example, a natural language instance s can be parsed by an abstract semantic representation parser into a graph g_s = (V_s, E_s) with a node set and an edge set, where V_s is the node set, E_s is the edge set, u and v denote nodes, and r denotes the type of an edge. The relationship types of edges comprise special types and a general type: the special types include the agent (ARG0), the patient (ARG1), time (time) and place (location); because these edge relationship types are similar to the relationship between a trigger word and a role in an event, they are used to extract the semantic features between words and events. The general relationship type is the instance relation (instance).
For example, in a news information scenario, the acquired text to be processed is "Little Sunshine Information reports a fire occurring in the community today". Parsing this text with the abstract semantic representation parser yields the graph structure (AMR structure) shown in fig. 11, which connects "report" with "Little Sunshine Information" through the edge "ARG0", "report" with "fire" through the edge "ARG1", "fire" with "today" through the edge "time", and "fire" with "community" through the edge "ARG1". The semantic relations between words and events can thus be reflected clearly and intuitively, and the semantic features between words and events can be reflected through the semantic relations between nodes.
Specifically, after the text to be processed is acquired, it can be parsed by the abstract semantic parser into a semantic abstract representation that clearly and intuitively reflects the semantic relationships between nodes, so that the semantic features between words and events can subsequently be represented by extracting the semantic relationships between edges and nodes, and the trigger words and role words of the corresponding event can be accurately acquired through those features, improving the accuracy of event acquisition.
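As an illustration only, the parsed graph g_s = (V_s, E_s) for the example sentence could be written down as follows; the node and edge labels are assumptions consistent with the description above.
    # Hypothetical node set V_s and edge set E_s for the example news sentence.
    V_s = {"report", "little sunshine information", "fire", "today", "community"}
    E_s = {
        ("report", "ARG0", "little sunshine information"),
        ("report", "ARG1", "fire"),
        ("fire",   "time", "today"),
        ("fire",   "ARG1", "community"),
    }
    for u, r, v in sorted(E_s):   # each edge (u, r, v) joins nodes u and v with type r
        print(f"{u} --{r}--> {v}")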
In step S103, semantic coding is performed on the abstract semantic representation and the text to be processed to obtain a semantic embedded vector, where the semantic embedded vector is used to represent the semantic features between each word and an event.
In this embodiment, the relationships represented by the three edge types ARG, time and location in the abstract semantic representation are similar to the relationship between a trigger word and a role in an event, and the semantic relationships between nodes are reflected by these special semantic relationships, so the semantic features between each word and an event can be reflected through them. Therefore, in order to accurately extract the node pairs connected by these edges from the abstract semantic representation, convert the extracted node pairs into semantic features between words and events, and obtain the trigger words and role words of the event from those features so as to improve the precision of event acquisition, this embodiment performs semantic coding on the abstract semantic representation and the text to be processed; specifically, a sequence coding model converts the semantic relationships between the nodes of the abstract semantic representation into a semantic embedded vector that reflects the semantic features between words and events.
For example, the acquired text to be processed is a natural language instance s = {w_1, w_2, ..., w_n}, where w_i is the i-th word. After the text to be processed passes through the sequence coding model, the semantic representation matrix of s, i.e. the semantic embedded vector, is obtained.
Specifically, after the abstract semantic representation is obtained, in order to accurately convert the relationships between its nodes and edges into semantic features between words and events, the abstract semantic representation and the text to be processed may be passed through the sequence coding model, converting the semantic relationships between nodes into a semantic embedded vector that reflects the semantic features between words and events. This vector can be further analyzed subsequently so that the trigger words and role words of the event corresponding to the text are accurately obtained, improving the accuracy of event acquisition.
In step S104, graph encoding is performed on the abstract semantic representation to obtain a graph embedding vector, where the graph embedding vector is used to represent the structural features between nodes connected by edges.
In this embodiment, the structure of the abstract semantic representation is the graph structure shown in fig. 11. In order to further extract the structural features between the nodes connected by edges in the graph structure, obtain a graph embedding vector that reflects the structural features between words and events through them, and thereby accurately obtain the trigger words and role words of the event corresponding to the text to be processed, improving the accuracy of event acquisition, this embodiment performs graph coding on the abstract semantic representation; specifically, a graph coding model may convert the structural features between the nodes of the abstract semantic representation into a graph embedding vector reflecting the structural features between words and events.
For example, the obtained text to be processed is a natural language instance s = {w_1, w_2, ..., w_n}, and g_s = (V_s, E_s) is obtained after abstract semantic parsing, where V_s is the node set and E_s is the edge set. When the abstract semantic representation passes through a graph encoder, the representation vector of g_s, i.e. the graph embedding vector, is obtained.
Specifically, after the abstract semantic representation is obtained, the structural features of its nodes and edges can be used to represent the structural features between each word and an event. The obtained abstract semantic representation may be passed through the graph coding model to obtain a graph embedding vector representing the structural features between nodes connected by edges, which in turn reflect the structural features between words and events. The graph embedding vector can then be further analyzed so that the trigger words and role words of the event corresponding to the text to be processed are accurately obtained, improving the accuracy of event acquisition.
In step S105, the semantic embedded vector and the graph embedded vector are spliced to obtain a spliced feature vector;
in this embodiment, after the semantic embedded vector and the graph embedded vector are obtained, they may be spliced to enrich and expand the semantic definition framework of the event, so that the spliced feature vector can subsequently be identified and the trigger words and role words can be classified more finely through the learned semantic and structural features between words and events, improving the accuracy of event extraction.
For example, given a semantic embedded vector h and a graph embedded vector g, splicing them yields a spliced feature vector f = (h, g).
Specifically, after the semantic embedded vector and the graph embedded vector are acquired, in order to classify trigger words and role words more finely through the learned semantic and structural features between words and events and thereby improve the event extraction precision, the two vectors can be spliced to obtain a feature-expanded spliced feature vector.
In step S106, the spliced feature vectors are identified, and a target event is output, where the target event includes a trigger word and a role word extracted from the N words, the trigger word is used to indicate an event occurring in the text to be processed, and the role word is used to indicate roles of each entity in the text to be processed in the event.
In this embodiment, after the spliced feature vector is acquired, the semantic and structural features between words and events that it carries must be accurately converted into the probabilities of the different categories, such as trigger words with event types or role words, in order to extract the event. This embodiment therefore inputs the spliced feature vector into an event classifier for recognition; through the learned semantic and structural features between words and events, trigger words and role words can be classified in more detail, and the information of interest to the information acquirer, namely the target event, is output, improving the accuracy of event extraction.
For example, as shown in fig. 12, after the obtained text "Little Sunshine Information reports a fire occurring in the community today" is converted into a spliced feature vector, feature recognition yields the trigger word "fire" with the corresponding event type "disaster", the argument role of "today" in the event being "time-within", and the argument role of "in the community" being "place".
Specifically, after the spliced feature vector is obtained, the vector, formed by splicing the semantic embedded vector reflecting the semantic features between words and events with the graph embedded vector reflecting the structural features between words and events, is input into the event classifier for identification, so that the trigger words and role words of the event can be classified more finely through the learned features and the accuracy of event extraction is improved.
In the embodiment of the application, an event extraction method is provided. In the above manner, words are converted into nodes and edges that are convenient to identify; the nodes and edges are converted into semantic embedded vectors that accurately reflect the semantic features between each word and an event, and the structural features between nodes connected by edges are converted into graph embedded vectors that reflect the structural features between each word and an event. The event semantic definition framework is thereby expanded, so that the learned semantic and structural features between words and events have higher accuracy and the precision of acquiring target events is improved.
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the event extraction method provided in the embodiment of the present application, as shown in fig. 3, the method further includes:
in step S301, the text to be processed is subjected to node coding processing to obtain a node coding vector, where the node coding vector is used to initialize the semantic features between each word and an event;
in step S302, the node encoding vector and the abstract semantic representation are subjected to graph encoding processing by a graph encoding model to obtain a graph embedding vector.
In this embodiment, after the text to be processed is obtained, it may be converted into a node coding vector that initializes the semantic features between each word and an event. The node coding vector and the abstract semantic representation, which reflects the structural relationships between nodes, may then be converted by a graph coding model into the structural features between nodes, yielding a graph embedding vector that reflects the structural features between each word and an event. The structural features most important for event extraction can subsequently be captured through the graph embedding vector, improving the accuracy of event extraction.
The graph coding model may specifically adopt a Graph Isomorphism Network (GIN), another graph neural network, or the like, which is not limited here.
For example, a text to be processed s = {w_1, w_2, ..., w_n} is obtained, and g_s = (V_s, E_s) is obtained after abstract semantic parsing. For s = {w_1, w_2, ..., w_n}, node coding can be performed with a RoBERTa model to obtain the node coding vectors {h} = {h_1, h_2, ..., h_n} = RoBERTa(w_1, w_2, ..., w_n); then the node coding vectors {h} and the abstract semantic representation g_s are passed through a Graph Isomorphism Network (GIN) to obtain the graph embedding vector g = GIN(g_s, {h}).
Specifically, after the text to be processed and its abstract semantic representation are obtained, the text may be node-coded to obtain a node coding vector that initializes the semantic features between each word and an event, and the abstract semantic representation, which reflects the structural relationships between nodes, is converted by the graph coding model into a graph embedding vector reflecting the structural features between each word and an event. In this way the structural features between the nodes of the graph structure expressed by the abstract semantic representation are accurately captured, the structural features between words and events are reflected through them, and the semantic relation between a trigger word and a role in an event can be learned more accurately through the graph embedding vector, improving the event extraction precision.
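As an illustration only, a minimal GIN-style graph encoder could look like the following PyTorch sketch; the layer sizes, readout and initialization are assumptions, not the patent's specification.
    import torch
    import torch.nn as nn

    class GINLayer(nn.Module):
        """One GIN update: h_v <- MLP((1 + eps) * h_v + sum over neighbors of h_u)."""
        def __init__(self, dim):
            super().__init__()
            self.eps = nn.Parameter(torch.zeros(1))
            self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

        def forward(self, h, adj):
            # h: (num_nodes, dim) node coding vectors; adj: (num_nodes, num_nodes) adjacency
            return self.mlp((1 + self.eps) * h + adj @ h)

    def graph_embed(h, adj, layers):
        for layer in layers:
            h = layer(h, adj)
        return h.sum(dim=0)   # sum readout over nodes -> graph embedding vector g

    h = torch.randn(5, 16)                    # node coding vectors for 5 nodes
    adj = (torch.rand(5, 5) > 0.5).float()    # toy adjacency matrix for g_s
    g = graph_embed(h, adj, [GINLayer(16), GINLayer(16)])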
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the event extraction method provided in the embodiment of the present application, as shown in fig. 4, identifying the spliced feature vector, and outputting the target event includes:
in step S401, determining a maximum pooling feature vector of a trigger word and a maximum pooling feature vector of a role word corresponding to the semantic embedded vector according to a dynamic maximum pooling algorithm;
in step S402, the maximum pooling feature vector of the trigger word and the maximum pooling feature vector of the role word are spliced with the spliced feature vector to obtain a feature vector to be identified;
in step S403, the feature vectors to be recognized are classified and recognized to obtain a target event.
In this embodiment, the dynamic max-pooling method is an event classifier under supervised learning. Since the trigger word of an event directly triggers the event and is an important feature for determining the event category, max-pooling features can be used to solve the event determination problem. The method comprises two main steps, candidate event type classification and candidate event role classification, which determine the max-pooling feature vector of the trigger word and the max-pooling feature vector of the role word. Specifically, these vectors can be determined from the semantic embedded vector according to the dynamic max-pooling algorithm: the semantic embedded vector is divided around the candidate trigger word embedding vector and the candidate role word embedding vector, and max-pooling is performed on each segment to obtain the two max-pooling feature vectors.
Performing max-pooling on the semantic embedded vectors {h_1, h_2, ..., h_t, ..., h_n} can be understood as candidate event type classification of the semantic embedded vector, yielding the dynamic max-pooling aggregate feature, i.e. the max-pooling feature vector of the trigger word:
[x_{1,t}]_i = max{[h_1]_i, ..., [h_t]_i}
[x_{t+1,n}]_i = max{[h_{t+1}]_i, ..., [h_n]_i}
x = [x_{1,t}, x_{t+1,n}]
where h_t is the candidate trigger word embedding vector, x_{1,t} is the representation vector of the first segment, i.e. the sequence up to the trigger word t, x_{t+1,n} is the representation vector of the second segment, i.e. the sequence after the trigger word t, [x]_i denotes the i-th dimension of the representation vector x, and [h]_i denotes the i-th dimension of a semantic embedding vector.
Performing max-pooling on the semantic embedded vectors {h_1, h_2, ..., h_t, ..., h_a, ..., h_n} can be understood as candidate event role classification of the semantic embedded vector, yielding the dynamic max-pooling aggregate feature, i.e. the max-pooling feature vector of the role word:
[x_{1,t}]_i = max{[h_1]_i, ..., [h_t]_i}
[x_{t+1,a}]_i = max{[h_{t+1}]_i, ..., [h_a]_i}
[x_{a+1,n}]_i = max{[h_{a+1}]_i, ..., [h_n]_i}
x = [x_{1,t}, x_{t+1,a}, x_{a+1,n}]
where h_t is the candidate trigger word embedding vector, h_a is the candidate role word embedding vector, x_{1,t} is the representation vector of the first segment, i.e. the sequence up to the trigger word t, x_{t+1,a} is the representation vector of the second segment, i.e. the sequence after the trigger word t and up to the role word a, and x_{a+1,n} is the representation vector of the third segment, i.e. the sequence after the role word a.
Further, the semantic relation features between the trigger word and the role are expanded by splicing the max-pooling feature vector of the trigger word and the max-pooling feature vector of the role word with the spliced feature vector to obtain the feature vector to be identified: the obtained dynamic max-pooling aggregate feature x is spliced with the graph embedding vector g = GIN(g_s, {h}) to obtain the feature vector to be identified f = (x, g).
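As an illustration only, a minimal NumPy sketch of the dynamic max-pooling and splicing above; the dimensions and positions are toy assumptions.
    import numpy as np

    def dynamic_max_pool(H, t, a=None):
        """Split the semantic embedding matrix H (n x d) at the candidate trigger
        position t (and candidate role position a, if given), max-pool each segment."""
        if a is None:                              # candidate event type classification
            segments = [H[:t + 1], H[t + 1:]]
        else:                                      # candidate event role classification
            segments = [H[:t + 1], H[t + 1:a + 1], H[a + 1:]]
        return np.concatenate([seg.max(axis=0) for seg in segments])

    H = np.random.randn(6, 4)      # toy semantic embedded vectors h_1..h_6
    x = dynamic_max_pool(H, t=2)   # aggregate feature x = [x_{1,t}, x_{t+1,n}]
    g = np.random.randn(4)         # stand-in graph embedding vector
    f = np.concatenate([x, g])     # feature vector to be identified f = (x, g)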
Further, the feature vectors to be identified are classified using a classifier (Softmax):
Pred=Softmax(MLP(f))
wherein f is the feature vector to be identified.
Further, the classifier Softmax is optimized by using a cross entropy loss function:
Loss=CrossEntropy(Pred,Gold)
where Gold is a true label.
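As an illustration only, a PyTorch sketch of this classification and loss step; the dimensions and label are toy assumptions (note that PyTorch's CrossEntropyLoss applies log-softmax internally, so it takes the raw MLP output).
    import torch
    import torch.nn as nn

    mlp = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 33))  # 33 event types
    f = torch.randn(1, 8)                         # feature vector to be identified
    logits = mlp(f)
    pred = torch.softmax(logits, dim=-1)          # Pred = Softmax(MLP(f))
    gold = torch.tensor([5])                      # Gold: index of the true label
    loss = nn.CrossEntropyLoss()(logits, gold)    # Loss = CrossEntropy(Pred, Gold)
    loss.backward()                               # gradient step for optimization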
Specifically, after the spliced feature vector is obtained, the max-pooling feature vector of the trigger word and the max-pooling feature vector of the role word are spliced with the spliced feature vector, which reflects the semantic and structural features between words and events, so that trigger words and role words can be classified more finely and the accuracy of event extraction is improved.
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the event extraction method provided in the embodiment of the present application, as shown in fig. 5, identifying the spliced feature vector, and outputting the target event includes:
in step S501, performing spectral clustering on the spliced feature vectors to obtain a node cluster map, where the node cluster map includes cluster nodes and cluster edges connecting the cluster nodes;
in step S502, an edge weight value between the clustering nodes is determined according to a distance of the clustering edges;
in step S503, performing a graph cutting process on the node clustering graph to obtain K clustering subgraphs, where K is an integer greater than 1;
in step S504, when the edge weight value of each cluster subgraph meets a preset weight value, outputting a target event.
In this embodiment, in order to accurately convert the obtained spliced feature vector into a trigger word with an event type and into argument roles, the spliced feature vector may be identified by a joint clustering method, an event classifier under unsupervised learning. The joint clustering method is an iterative clustering method based on spectral clustering. Specifically, a first spectral clustering is performed on the spliced feature vectors to obtain a node clustering graph comprising clustering nodes and the clustering edges connecting them. The edge weight values between clustering nodes are then determined according to the distances of the clustering edges, where the distance of a clustering edge is the distance between its two clustering nodes: two clustering nodes that are farther apart have a lower edge weight value, and two that are closer have a higher edge weight value. The node clustering graph is then cut according to the edge weight values so that the sum of edge weights between different subgraphs after cutting is as low as possible and the sum of edge weights within each subgraph is as high as possible, i.e. the subgraphs are as far apart from each other and as internally similar as possible. When the edge weight value of each clustering subgraph meets the preset weight value, which is set according to the practical application requirements and is not specifically limited here, the clustering results of the trigger word clusters and the role clusters, i.e. the target event, are output.
Further, in this embodiment, when the spliced feature vectors are identified by the joint clustering method, a first spectral clustering may be performed to obtain a node clustering graph comprising clustering nodes and the clustering edges connecting them as the first clustering result. Then, by computing the overall similarity of the node clustering graph, when the intra-class similarity is too low and the inter-class similarity is too high, the similarity can be adjusted based on the result of the first spectral clustering and spectral clustering performed again, until the similarity meets the requirement or the iteration upper limit is reached, after which the clustering results of the trigger word clusters and the role clusters, i.e. the target event, are output. This can be understood as computing the distance between each clustering node and a sample for a randomly initialized node clustering graph, assigning the sample to the nearest cluster, i.e. assigning each event type in the data set to a cluster, until convergence, and finally outputting the center point and size of each cluster.
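As an illustration only, a scikit-learn sketch of the spectral-clustering step of this unsupervised classifier; the feature matrix, K and affinity choice are assumptions, and the patent's iterative re-clustering loop is indicated only by a comment.
    import numpy as np
    from sklearn.cluster import SpectralClustering

    F = np.random.randn(40, 12)   # stand-in spliced feature vectors, one row per candidate
    K = 4                         # assumed number of clustering subgraphs

    labels = SpectralClustering(n_clusters=K, affinity="nearest_neighbors",
                                n_neighbors=10, random_state=0).fit_predict(F)
    # In the patent's iterative scheme, clustering would be repeated (adjusting
    # similarities) until the edge weights meet the preset value or the iteration
    # limit is reached; labels then give the trigger-word / role clusters.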
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the event extraction method provided in the embodiment of the present application, performing semantic coding processing on the abstract semantic representation to obtain a semantic embedded vector includes:
and coding the abstract semantic representation according to a sequence coding model to obtain a semantic embedded vector, wherein the sequence coding model is used for carrying out time-sequence coding on nodes connected through edges.
In this embodiment, after the text to be processed and its corresponding abstract semantic representation are acquired, in order to accurately extract the semantic features between the nodes included in the text to be processed and the abstract semantic representation, that is, to accurately convert them into semantic features between words and events, a sequence coding model is used to time-sequentially encode the nodes connected by edges for the text to be processed and the abstract semantic representation. The sequence coding model may be, for example, a bidirectional masked pre-training language model (BERT), a RoBERTa model, or a permutation pre-training language model (XLNet), without specific limitation here. In this way, a semantic embedded vector that has a time sequence and can represent the semantic features between words and events is accurately acquired, so that by further identifying the semantic embedded vector, the events corresponding to the text to be processed can subsequently be accurately identified from the semantic features between words and events, thereby improving the accuracy of event acquisition.
For example, for an obtained text to be processed $s = \{w_1, w_2, \ldots, w_n\}$, an abstract semantic representation $g_s = (V_s, E_s)$ is obtained after abstract semantic representation parsing is performed on the text to be processed. Then $s$ and $g_s$ are semantically encoded using the RoBERTa model, that is, the word sequence is encoded into a sequence of vector representations. Specifically, the multi-layer bidirectional Transformer encoder in the RoBERTa model is adopted to obtain the semantic representation matrix of hidden vectors $\{h_1, h_2, \ldots, h_n\} = \mathrm{RoBERTa}(w_1, w_2, \ldots, w_n)$, that is, the semantic embedded vector, where each $w_i$ represents a word and $h_i$ is the hidden vector corresponding to $w_i$.
Specifically, after the text to be processed and its abstract semantic representation are obtained, the multi-layer bidirectional Transformer encoder in the RoBERTa model of the sequence coding model may be adopted to obtain the semantic representation matrix of hidden vectors, that is, a semantic embedded vector that has a time sequence and can represent the semantic features between words and events.
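For illustration, the following sketch obtains the semantic representation matrix with the open-source HuggingFace Transformers implementation of RoBERTa; the checkpoint name and example sentence are assumptions, but the returned hidden states correspond to the matrix $\{h_1, h_2, \ldots, h_n\}$ described above.

```python
# A minimal sketch of the semantic-encoding step: RoBERTa's multi-layer
# bidirectional Transformer encoder maps the word sequence of the text
# to be processed to one hidden vector h_i per token w_i.
import torch
from transformers import RobertaTokenizer, RobertaModel

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
encoder = RobertaModel.from_pretrained("roberta-base")

text = "An earthquake hit the coastal city and rescue teams were dispatched."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = encoder(**inputs)

# Shape (1, seq_len, 768): the semantic representation matrix of hidden
# vectors, i.e. the semantic embedded vector used in later steps.
hidden_states = outputs.last_hidden_state
print(hidden_states.shape)
```

Note that RoBERTa tokenizes into subwords, so in practice subword vectors are typically aggregated back to word-level vectors before being aligned with AMR nodes; that bookkeeping is omitted here.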
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the event extraction method provided in the embodiment of the present application, as shown in fig. 6, the method further includes:
in step S601, obtaining a corpus to be processed in a database, where the corpus to be processed includes M sentences, and M is an integer greater than or equal to 1;
in step S602, an abstract semantic representation set is generated from the corpus to be processed, where the abstract semantic representation set includes a set of nodes corresponding to the sentence and a set of edges formed by connecting edges between the nodes.
In this embodiment, in order to enable the model to better learn the semantic features and structural features between words and events, the model is pre-trained to further improve its learning capability. Before the pre-training, the corpus to be processed is preprocessed so as to convert it into data of a uniform format, and the unlabeled data is converted into semantic and structural relationships between nodes and edges, thereby realizing automatic labeling of the data without manual annotation. This not only reduces the training cost but also ensures a sufficient amount of training data, which improves the learning capability of the model to a certain extent and avoids the situation in which an expensively hand-labeled data set limits the semantic relationships between trigger words and roles that the model can learn and thus reduces its learning capability. Concretely, the corpus to be processed, that is, a batch of unlabeled natural-language instances, is obtained from a database, and abstract semantic representation is performed on the corpus to be processed; the processing is the same as the abstract semantic representation of the text to be processed and is not repeated here. This yields a set formed by the abstract semantic representations of a batch of texts capable of reflecting the semantic relationships among words, that is, the abstract semantic representation set, so that the model training samples are expanded, the learning capability of the model is improved, and the accuracy of event extraction is improved.
For example, as shown in fig. 11, generating the abstract semantic representation set from the corpus to be processed means parsing unlabeled data, such as the sentences in the corpus to be processed, each into the graph structure of one abstract semantic representation, where the nodes of the graph structure represent concepts and the edges represent semantic relationships between different concepts. A concept is basically a word; for an entity phrase, the graph structure connects the words of the phrase together using name and op (operand) edges, and the connected words can be merged, so that the entity phrase becomes a single node. It can be understood that, since the relationships represented by the three kinds of edges ARG, time, and location in the abstract semantic representation structure are similar to the relationships between the trigger word and the roles in an event, the semantic relationships between nodes can be reflected by these three special semantic relationships, and furthermore the semantic features between words and events can be reflected by these three special semantic relationships between nodes. Therefore, this embodiment generates the abstract semantic representation set from the corpus to be processed, so that node pairs connected by these three kinds of edges can subsequently be extracted from each abstract semantic representation structure in the set and converted into semantic features between words and events; the semantic features between trigger words and roles in events are thus reflected more accurately, the learning capability of the model is improved, and the accuracy of event acquisition is improved.
The graph structure of the abstract semantic representation can be understood as a single-rooted directed acyclic graph: the single root, such as "report-01" illustrated in fig. 11, ensures the integrity of the sentence semantics; "directed" ensures semantic transmission; and "acyclic" prevents semantic transmission from entering a dead loop.
Specifically, before the model is pre-trained, in order to avoid the situation in which an expensively hand-labeled data set limits the semantic relationships between trigger words and roles that the model can learn and thereby reduces its learning capability, this embodiment performs abstract semantic representation on the unlabeled corpus to be processed and obtains, through the semantic relationships between nodes, a batch of abstract semantic representations capable of reflecting the semantic features between words and events, that is, the abstract semantic representation set, so as to expand the training data set, improve the learning capability of the model, and improve the precision of event extraction.
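A hedged sketch of this preprocessing follows; it assumes the third-party amrlib parser (with its sentence-to-graph model downloaded) together with the penman library, and any AMR parser producing PENMAN strings could be substituted.

```python
# Turn a batch of unlabeled sentences (the corpus to be processed) into
# an abstract semantic representation set: one (node set, edge set) pair
# per sentence, with no manual annotation.
import amrlib
import penman

sentences = [
    "The company reported strong earnings in March.",
    "Rescue teams were dispatched to the coastal city.",
]

stog = amrlib.load_stog_model()            # sentence-to-graph AMR model
amr_strings = stog.parse_sents(sentences)  # PENMAN-format graph strings

amr_set = []
for amr_str in amr_strings:
    g = penman.decode(amr_str)
    nodes = {var: concept for var, _, concept in g.instances()}
    edges = [(src, role, tgt) for src, role, tgt in g.edges()]
    amr_set.append((nodes, edges))         # node set V_s and edge set E_s
```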
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the event extraction method provided in the embodiment of the present application, as shown in fig. 7, the method further includes:
in step S701, a contrastive learning training data set is generated according to the abstract semantic representation set;
in step S702, the basic model is pre-trained according to the contrastive learning training data set to obtain a training model.
In this embodiment, the contrastive learning training data set is obtained by constructing corresponding positive examples and negative examples from the obtained abstract semantic representation set based on the contrastive learning method, where the positive examples are used to approximately represent close (affinity) relationships and the negative examples are used to approximately represent distant relationships. The contrastive learning training data set includes the contrastive learning semantic training data set and the contrastive learning graph training data set, and may also include other data sets constructed based on the contrastive learning method, without specific limitation here.
Specifically, after the abstract semantic representation set is obtained, in order to enable the basic model to better learn the semantic features and structural features between words and events, the basic model may be pre-trained to further improve its learning capability. The basic model may be a pre-constructed graph isomorphism network (GIN), a graph neural network, a bidirectional masked pre-training language model (BERT), a RoBERTa model, or a permutation pre-training language model (XLNet), among others, without specific limitation here. In this embodiment, when the model is pre-trained, corresponding positive examples and negative examples are first constructed from the obtained abstract semantic representation set based on the contrastive learning method, and the constructed positive and negative examples are used as the contrastive learning training data set, so that the learning capability of the basic model can be improved by subsequently pre-training it on the contrastive learning training data set, obtaining the trained training model and thereby improving the precision of event extraction.
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the event extraction method provided in the embodiment of the present application, as shown in fig. 8, the method further includes:
in step S801, a semantic positive example is constructed according to the edge, the node, the edge set, and the node set, where the semantic positive example is used to represent semantic affinity between the trigger word and the role;
in step S802, a node replacement operation is performed on the semantic positive example to obtain a semantic negative example, where the semantic negative example is used to represent a semantically distant relationship between a trigger word and a role;
in step S803, the semantic positive example and the semantic negative example are used as the contrastive learning semantic training data set.
In this embodiment, because the relationships represented by the three kinds of edges ARG, time, and location in the abstract semantic representation are similar to the relationships between the trigger word and the roles in an event, and the semantic relationships between nodes are reflected by these three special semantic relationships, the semantic features between words and events can be reflected by these three special semantic relationships between nodes. The node pairs connected by these three kinds of edges can therefore be accurately extracted from the abstract semantic representation and converted into semantic features between words and events, so that the trigger words and role words in events can be acquired from those features, improving the accuracy of event acquisition. After the abstract semantic representation set is obtained, based on the contrastive learning method, the edges, nodes, edge sets, and node sets are first marked as $P_s = \{(u, v) \mid (u, v, r) \in E_s,\; r \in R_p\}$ with $R_p = \{\mathrm{ARG}, \mathrm{time}, \mathrm{location}\}$, and $P_s$ is taken as the semantic positive example, that is, a positive example capable of approximately representing the semantic affinity between the trigger word and the role.
Further, in order to learn the semantic features between words and events more accurately, so that the semantic relationships between trigger words and roles in events can subsequently be reflected by the learned semantic features between words and events, this embodiment constructs negative examples, that is, semantic negative examples capable of approximately representing a semantically distant relationship between trigger words and roles, by replacing the trigger word or the role of a node pair. It can be understood as follows: assume $(t, a) \in P_s$ is a node pair in which $t$ is the trigger word and $a$ is the role. Then $t$ can be replaced by $\tilde{t} \in V_s$ satisfying $(\tilde{t}, a, r) \notin E_s$ for $r \in R_p$, and after the replacement a negative example $(\tilde{t}, a)$ is obtained. Similarly, a negative example $(t, \tilde{a})$ can be obtained by replacing the role with $\tilde{a} \in V_s$ satisfying $(t, \tilde{a}, r) \notin E_s$ for $r \in R_p$. Therefore, $m_t$ negative examples can be constructed by replacing the trigger word and $m_a$ negative examples can be constructed by replacing the role, and the set of these negative examples, which can approximately represent the semantically distant relationship between the trigger word and the role, constitutes the semantic negative examples.
Specifically, after the abstract semantic representation set is obtained, semantic positive examples capable of approximately representing the semantically close relationship between trigger words and roles are constructed from the edges, nodes, edge sets, and node sets, and semantic negative examples capable of approximately representing the semantically distant relationship between trigger words and roles are obtained by performing node replacement operations on the semantic positive examples. The semantic positive examples and semantic negative examples are then used as the contrastive learning semantic training data set, so that the model can subsequently be trained with this data set and can learn more semantic relationships between trigger words and roles from it, further improving the learning capability of the model, enhancing the accuracy of model prediction, and improving the precision of event acquisition.
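The construction of the contrastive learning semantic training data set can be sketched as follows; the sampling counts and data layout are illustrative assumptions.

```python
# Build semantic positive and negative examples from one AMR graph,
# following the definitions above: positives P_s are node pairs joined
# by an ARG/time/location edge; negatives replace the trigger (or the
# role) with a node not joined to its partner by any such edge.
import random

R_P = ("ARG", "time", "location")

def build_semantic_pairs(nodes, edges, m_t=4, m_a=4, seed=0):
    """nodes: iterable of node ids; edges: (u, v, r) triples of E_s."""
    rng = random.Random(seed)
    connected = {(u, v) for u, v, r in edges if r in R_P}
    positives = list(connected)                        # the set P_s
    negatives = []
    node_list = list(nodes)
    for t, a in positives:
        # trigger replacement t -> t~ with (t~, a) outside P_s
        cand_t = [v for v in node_list if v != t and (v, a) not in connected]
        negatives += [(v, a) for v in rng.sample(cand_t, min(m_t, len(cand_t)))]
        # role replacement a -> a~ with (t, a~) outside P_s
        cand_a = [v for v in node_list if v != a and (t, v) not in connected]
        negatives += [(t, v) for v in rng.sample(cand_a, min(m_a, len(cand_a)))]
    return positives, negatives
```

In real AMR output the ARG roles appear as :ARG0, :ARG1, and so on, so a prefix match on the role label would replace the exact membership test used here.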
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the event extraction method provided in the embodiment of the present application, the method further includes:
and carrying out reverse training on the basic sequence coding model according to the contrastive learning semantic training data set and the loss function to obtain the sequence coding model.
In this embodiment, after the contrastive learning semantic training data set is obtained, in order to enable the model to better learn the semantic features between words and events, the model may be pre-trained according to the contrastive learning semantic training data set, and the pre-trained basic coding model may be reversely trained according to the following loss function in combination with the semantic representation matrix, so as to obtain the optimized sequence coding model and improve its ability to learn the semantic features between words and events, thereby improving the accuracy of event extraction:
$$\mathcal{L}_{(t,a)} = -\log \frac{\exp\left(h_t^{\top} W h_a\right)}{\exp\left(h_t^{\top} W h_a\right) + \sum_{i=1}^{m_t} \exp\left(h_{\tilde{t}_i}^{\top} W h_a\right) + \sum_{i=1}^{m_a} \exp\left(h_t^{\top} W h_{\tilde{a}_i}\right)}$$

where $W$ is a learnable matrix defining a similarity measure between the representation vectors; $h_t$ is the representation vector of the trigger word $t$, that is, the hidden vector of the trigger word; $h_a$ is the representation vector of the role $a$, that is, the hidden vector of the role; $\tilde{t}_i$ is the $i$-th replacement trigger word and $h_{\tilde{t}_i}$ is the hidden vector of the replacement trigger word, and likewise $h_{\tilde{a}_i}$ for the $i$-th replacement role; $\log$ is the logarithmic function with base $e$ and $\exp$ is the exponential function with base $e$.
Further, by summing all the pair losses, the final loss function is obtained:

$$\mathcal{L}_{\text{text}} = \sum_{s \in B} \sum_{(t, a) \in P_s} \mathcal{L}_{(t, a)}$$

where $t$ represents a trigger word, $a$ represents a role, $B$ represents the corpus to be processed, and $P_s$ is the semantic positive example set.
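A hedged PyTorch sketch of the per-pair loss $\mathcal{L}_{(t,a)}$ reconstructed above follows; tensor shapes and the identity initialization of $W$ are illustrative assumptions.

```python
# Contrastive semantic loss for one (trigger, role) pair: the positive
# score h_t^T W h_a competes in a softmax against the scores of the m_t
# replaced-trigger and m_a replaced-role negatives.
import torch

def pair_loss(h_t, h_a, neg_t, neg_a, W):
    """h_t, h_a: (d,); neg_t: (m_t, d); neg_a: (m_a, d); W: (d, d)."""
    pos = h_t @ W @ h_a                    # score of the positive pair
    neg1 = neg_t @ W @ h_a                 # (m_t,) scores of (t~_i, a)
    neg2 = (h_t @ W) @ neg_a.T             # (m_a,) scores of (t, a~_i)
    logits = torch.cat([pos.view(1), neg1, neg2])
    # negative log-softmax of the positive entry equals L_{(t,a)} above
    return -torch.log_softmax(logits, dim=0)[0]

d = 768
W = torch.nn.Parameter(torch.eye(d))       # learnable similarity metric
loss = pair_loss(torch.randn(d), torch.randn(d),
                 torch.randn(4, d), torch.randn(4, d), W)
loss.backward()                            # one reverse-training step on W
```

Summing pair_loss over all $(t, a) \in P_s$ for every sentence $s$ in the batch $B$ yields the final loss $\mathcal{L}_{\text{text}}$.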
Specifically, after the contrastive learning semantic training data set is obtained, the model can be pre-trained according to the contrastive learning semantic training data set, which avoids repeated learning of the same material during training and reduces the learning cost. The basic sequence coding model is then reversely trained with the loss function in combination with the semantic representation matrix, iteratively optimizing the model until convergence so as to obtain the optimized sequence coding model, which improves the ability of the sequence coding model to learn the semantic features between words and events, thereby improving the accuracy of event extraction.
It should be noted that fig. 13 can be understood as the evaluation results of the sequence coding model obtained after the pre-training based on the contrastive learning method. Under supervised learning, fig. 13(a) shows the performance of various models on the ACE 2005 data set for event extraction, including Event Detection (ED) and Event Argument Extraction (EAE), and fig. 13(b) shows the Event Detection (ED) performance on the MAVEN data set. The model CLEVE denotes the sequence coding model obtained after the basic coding model is reversely trained according to the loss function and the semantic representation matrix in this embodiment, that is, the model obtained after the contrastive-learning-based pre-training is added. Taking the F1 index as the evaluation basis, the model performance is significantly improved after the contrastive-learning-based pre-training is added, where P denotes Precision, R denotes Recall, the F1 score is the harmonic mean of P and R, ED denotes Event Detection, and EAE denotes Event Argument Extraction.
Further, fig. 14 can be understood as the results under unsupervised learning: fig. 14(a) shows the performance of various models on the ACE 2005 data set for event extraction, including Event Detection (ED) and Event Argument Extraction (EAE), and fig. 14(b) shows the Event Detection (ED) performance on the MAVEN data set, where the model CLEVE denotes the sequence coding model obtained after the basic coding model is reversely trained according to the loss function and the semantic representation matrix in this embodiment, that is, the model obtained after the contrastive-learning-based pre-training is added. Taking the B-cubed index as the evaluation basis, the model performance is significantly improved after the contrastive-learning-based pre-training is added.
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the event extraction method provided in the embodiment of the present application, as shown in fig. 9, the method further includes:
in step S901, performing two times of random graph sampling on each abstract semantic representation in the abstract semantic representation set to obtain 2M random sampling subgraphs, where each random sampling subgraph includes a target node and a sub-node, and each abstract semantic representation corresponds to two random sampling subgraphs;
in step S902, taking the 2M randomly sampled subgraphs as a graph positive example, where the graph positive example is used to represent semantic affinity of the sub-nodes with respect to the target node;
in step S903, performing random combination processing on the randomly sampled subgraphs to obtain X randomly combined subgraphs, where each randomly combined subgraph includes two randomly sampled subgraphs, the two randomly sampled subgraphs in the randomly combined subgraphs respectively correspond to different sampling semantic representations, and X is an integer not equal to 2M;
in step S904, the X random combination subgraphs are taken as a negative graph example, which is used to represent the semantic distancing relationship of the sub-nodes with respect to the target node;
in step S905, the graph positive examples and the graph negative examples are used as the contrastive learning graph training data set.
In this embodiment, after the abstract semantic representation set is obtained, the graph structure of each abstract semantic representation may be regarded as an original graph. Then, in order to better learn the structural relationships between nodes during the subsequent pre-training of the model, this embodiment enhances the randomness of the structural relationships in the graph structures, so as to prevent the model from finding trivial shortcut solutions during pre-training and to improve the learning capability of the model, thereby improving the accuracy of event extraction.
Further, in order to further improve the ability of the model to learn structural features, this embodiment pre-trains the model with a contrastive learning graph training data set constructed based on the contrastive learning method. First, random graph sampling is performed twice on each abstract semantic representation in the abstract semantic representation set. Specifically, one abstract semantic representation is selected, any node of it is taken as the target node, and the nodes other than the selected target node are taken as its child nodes; random graph sampling starting from the selected target node then yields a random sampling subgraph. Similarly, performing random graph sampling twice on each of the M abstract semantic representations yields 2M random sampling subgraphs, where M is an integer greater than or equal to 1, and the 2M random sampling subgraphs are taken as graph positive examples, which can be used to approximately represent the semantic affinity of the child nodes relative to the target node. For example, the $i$-th abstract semantic representation $g_i$ in the set is randomly sampled twice to obtain its two corresponding random sampling subgraphs $a_{2i-1}$ and $a_{2i}$, where $i$ is an integer less than or equal to M, and the pair of random sampling subgraphs $a_{2i-1}$ and $a_{2i}$ is taken as one graph positive example, used to approximately represent the semantic affinity of the child nodes relative to the target node. Then, pairs of random sampling subgraphs are randomly combined (for example by permutation and combination, cross combination, or the like, without limitation here), where the two random sampling subgraphs in each combination correspond to different abstract semantic representations. For example, assume that random sampling subgraph $a_j$ corresponds to the abstract semantic representation $g_j$ and random sampling subgraph $a_k$ corresponds to $g_k$, with $g_j$ and $g_k$ distinct; combining $a_j$ and $a_k$ yields one random combination subgraph. Combining the 2M random sampling subgraphs in this way yields X random combination subgraphs, and the X random combination subgraphs are taken as graph negative examples, which can be used to approximately represent the semantically distant relationship of the child nodes relative to the target node; for instance, the random combination subgraph obtained from $a_j$ and $a_k$ is one graph negative example. Constructing graph positive examples and graph negative examples based on the contrastive learning method in this way enhances the randomness of the structural relationships the model learns, reduces the chance that the model finds trivial shortcut solutions when it is subsequently pre-trained with the contrastive learning graph training data set, and improves the ability of the model to learn the structural features in graph structures, thereby improving the accuracy of event extraction.
Specifically, after the abstract semantic representation set is obtained, based on the contrastive learning method, random graph sampling is first performed twice on each abstract semantic representation in the abstract semantic representation set to obtain the two random sampling subgraphs corresponding to each abstract semantic representation, and these random sampling subgraph pairs are taken as graph positive examples. Then, random sampling subgraphs corresponding to two different abstract semantic representations are randomly combined to obtain random combination subgraphs, where the two random sampling subgraphs included in each random combination subgraph do not correspond to the same abstract semantic representation, and the random combination subgraphs are taken as graph negative examples. The graph positive examples and graph negative examples together form the contrastive learning graph training data set, which reduces the chance of the model finding trivial shortcut solutions when it is pre-trained with the contrastive learning graph training data set and improves the ability of the model to learn the structural features in graph structures, thereby improving the accuracy of event extraction.
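The assembly of the contrastive learning graph training data set can be sketched as follows; sample_subgraph stands for the random-walk sampler sketched after the next embodiment, and the pairing scheme is an illustrative assumption.

```python
# Two subgraphs sampled from the same abstract semantic representation
# form a graph positive pair; subgraphs sampled from different
# representations form graph negative pairs.
import itertools
import random

def build_graph_pairs(amr_graphs, sample_subgraph, seed=0):
    rng = random.Random(seed)
    subgraphs = []                          # a_1, ..., a_{2M}
    for g in amr_graphs:                    # two samples per AMR graph
        subgraphs.append((g, sample_subgraph(g, rng)))
        subgraphs.append((g, sample_subgraph(g, rng)))
    positives = [(subgraphs[2 * i][1], subgraphs[2 * i + 1][1])
                 for i in range(len(amr_graphs))]
    negatives = [(a, b)
                 for (ga, a), (gb, b) in itertools.combinations(subgraphs, 2)
                 if ga is not gb]           # only cross-representation pairs
    return positives, negatives
```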
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the event extraction method provided in the embodiment of the present application, as shown in fig. 10, the method further includes:
in step S1001, any node in the abstract semantic representation is taken as a target node, and nodes other than the target node are taken as child nodes;
in step S1002, a random walk is performed from the target node to the child nodes to obtain a random node set, where the random node set includes the target node, the child nodes, and edges formed from the target node to the child nodes;
in step S1003, a node subgraph is drawn according to the random node set;
in step S1004, the target node and the sub-node in the node subgraph are numbered to obtain a randomly sampled subgraph.
In this embodiment, the graph sampling includes three steps: random walk with restart, subgraph induction, and anonymization. The random walk with restart takes any node in the abstract semantic representation as the target node, where the target node needs to satisfy the condition that its in-degree is 0, and takes the nodes other than the target node as child nodes. The original graph is then treated as an undirected graph and the walk proceeds with equal probability; for example, walking from A to B, C, or D each has a probability of one third. With a preset fixed probability, the walk returns to the starting point and restarts; the preset fixed probability is usually set to 0.8, or the range can be set to 0.0 to 1.0, without specific limitation here, so that the random walk explores a wider range and the sampled subgraph is more random. The walk ends when all the child nodes adjacent to the target node, that is, the starting point, have been visited, yielding a random node set including the target node, the child nodes, and the edges formed from the target node to the child nodes.
Further, the subgraph induction is to draw, according to the acquired random node set, a graph covering the target node and child nodes in the random node set and the edges formed from the target node to the child nodes, that is, the node subgraph.
Further, the anonymization is to renumber the obtained node subgraph to obtain the random sampling subgraph, which can be used to prevent the model from finding trivial shortcut solutions during pre-training and to improve the ability of the model to learn the structural features in graph structures, thereby improving the accuracy of event extraction.
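A hedged sketch of these three steps follows; the adjacency-dict representation, the walk length, and the treatment of the restart probability are illustrative assumptions (the 0.8 default follows the text above).

```python
# Graph sampling in three steps: random walk with restart, subgraph
# induction over the visited nodes, and anonymization by renumbering.
# `graph` is an undirected adjacency dict {node: [neighbors]} in which
# every node appears as a key.
import random

def sample_subgraph(graph, rng, start=None, steps=20, restart=0.8):
    start = start if start is not None else rng.choice(list(graph))
    visited, cur = {start}, start
    for _ in range(steps):
        if rng.random() < restart or not graph[cur]:
            cur = start                     # return to the starting point
        else:
            cur = rng.choice(graph[cur])    # equal-probability step
            visited.add(cur)
    # subgraph induction: keep edges with both endpoints visited
    edges = [(u, v) for u in visited for v in graph[u] if v in visited]
    # anonymization: renumber the nodes so no original ids survive
    renum = {node: i for i, node in enumerate(sorted(visited))}
    return {renum[u]: sorted(renum[v] for (uu, v) in edges if uu == u)
            for u in visited}
```

The condition that the target node has in-degree 0 (the AMR root) is left to the caller here: passing the root as `start` reproduces the behavior described above.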
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the event extraction method provided in the embodiment of the present application, the method further includes:
and carrying out reverse training on the basic graph coding model according to the contrastive learning graph training data set and the loss function to obtain the graph coding model.
Specifically, after the contrastive learning graph training data set is obtained, in order to enable the model to better learn the structural features between words and events, the model may be pre-trained according to the contrastive learning graph training data set, and the basic graph coding model may be reversely trained with the following loss function to obtain the optimized graph coding model, which improves the ability of the graph coding model to learn the structural features between words and events, thereby improving the accuracy of event extraction:
$$\mathcal{L}_{\text{graph}} = -\sum_{i=1}^{m} \log \frac{\exp\left(q_{2i-1}^{\top} q_{2i}\right)}{\sum_{j=1}^{2m} \mathbb{1}_{j \neq 2i-1} \exp\left(q_{2i-1}^{\top} q_j\right)}$$

where $m$ is the number of abstract semantic representation graphs in the input batch; $q_{2i-1}$ and $q_{2i}$ are the representation vectors, obtained by the graph encoder, of the random sampling subgraphs $a_{2i-1}$ and $a_{2i}$ corresponding to the $i$-th abstract semantic representation randomly drawn from the $m$ abstract semantic representations; $q_j$ is the representation vector, obtained by the graph encoder, of the random sampling subgraph $a_j$; and $\mathbb{1}_{j \neq 2i-1}$ is an indicator function that is 1 when $j \neq 2i-1$ and 0 otherwise.
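A hedged PyTorch sketch of this loss follows; the plain dot-product similarity (no temperature) mirrors the reconstructed formula, and the shapes are illustrative assumptions.

```python
# Graph-side contrastive loss: rows 2i and 2i+1 (0-based) of q encode
# the two subgraphs sampled from the i-th abstract semantic
# representation; each anchor's positive is its partner row and every
# other row in the batch is a negative.
import torch

def graph_contrastive_loss(q: torch.Tensor) -> torch.Tensor:
    """q: (2m, d) subgraph representation vectors from the graph encoder."""
    m = q.shape[0] // 2
    scores = q @ q.T                        # pairwise dot products
    loss = q.new_zeros(())
    for i in range(m):
        anchor = 2 * i                      # a_{2i-1} in 1-based notation
        logits = torch.cat([scores[anchor, :anchor],
                            scores[anchor, anchor + 1:]])  # 1_{j != 2i-1}
        # after dropping the self column, the positive a_{2i} sits at
        # index `anchor`
        loss = loss - torch.log_softmax(logits, dim=0)[anchor]
    return loss / m

loss = graph_contrastive_loss(torch.randn(8, 128, requires_grad=True))
loss.backward()
```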
Referring to fig. 15, fig. 15 is a schematic diagram of an embodiment of an event extraction device in an embodiment of the present application, and the event extraction device 20 includes:
the acquiring unit 201 is configured to acquire a to-be-processed text, where the to-be-processed text includes N words, and N is an integer greater than 1;
the generating unit 202 is configured to generate an abstract semantic representation according to the text to be processed, where the abstract semantic representation includes nodes corresponding one-to-one to the words and edges used to connect the nodes;
the processing unit 203 is configured to perform semantic coding processing on the abstract semantic representation and the text to be processed to obtain a semantic embedded vector, where the semantic embedded vector is used to represent semantic features between each word and an event;
the processing unit 203 is further configured to perform graph coding processing on the abstract semantic representation to obtain a graph embedding vector, where the graph embedding vector is used to represent structural features between nodes connected by edges;
the processing unit 203 is further configured to splice the semantic embedded vector and the graph embedded vector to obtain a spliced feature vector;
and the processing unit 203 is configured to identify the spliced feature vector and output a target event, where the target event includes a trigger word and a role word extracted from the N words, the trigger word is used to indicate an event occurring in the text to be processed, and the role word is used to indicate roles of each entity in the text to be processed in the event.
In the embodiment of the present application, an event extraction device is provided. In the above manner, words are converted into nodes and edges that are convenient to identify, the semantic features are converted into a semantic embedded vector capable of reflecting the semantic features between each word and the event, and the structural features between the nodes connected by edges are converted into a graph embedded vector capable of reflecting the structural features between each word and the event. The event semantic definition framework is thereby expanded, the learned semantic features and structural features between words and events have higher accuracy, and the precision of acquiring the target event is improved.
Alternatively, on the basis of the embodiment corresponding to fig. 15, in another embodiment of the event extraction device provided in the embodiment of the present application,
the processing unit 203 is further configured to perform node coding processing on the text to be processed to obtain a node coding vector, where the node coding vector is used to initialize semantic features between each word and event;
the processing unit 203 is further configured to perform graph coding processing on the node coding vector and the abstract semantic representation through a graph coding model to obtain a graph embedding vector.
In this embodiment, the processing unit 203 may perform node encoding processing on the acquired text to be processed to obtain a node encoding vector that can facilitate recognition of the graph encoding model and initialize semantic features between each word and event, and then the processing unit 203 may perform graph encoding processing on the node encoding vector and the abstract semantic representation through the graph encoding model to convert the structural features between each node into a graph embedding vector that can reflect the structural features between each word and event, so that the more important structural features in event extraction can be captured through the graph embedding vector subsequently, thereby improving the accuracy of event extraction.
Alternatively, on the basis of the embodiment corresponding to fig. 15, in another embodiment of the event extraction device provided in the embodiment of the present application,
the determining unit 204 is configured to determine a maximum pooling feature vector of a trigger word and a maximum pooling feature vector of a role word corresponding to the semantic embedded vector according to a dynamic maximum pooling algorithm;
the processing unit 203 is specifically configured to splice the maximum pooling feature vector of the trigger word and the maximum pooling feature vector of the role word with the spliced feature vector to obtain a feature vector to be identified;
and the output unit 205 is configured to perform classification and identification on the feature vectors to be identified to obtain a target event.
In this embodiment, the determining unit 204 may perform dynamic maximum pooling on the obtained semantic embedded vector to realize event type classification and event role classification on the semantic embedded vector, so as to obtain a maximum pooled feature vector of a trigger word and a maximum pooled feature vector of a role word, then, the processing unit 203 splices the maximum pooled feature vector of the trigger word and the maximum pooled feature vector of the role word with the spliced feature vector to obtain a feature vector to be recognized, and further, an event classifier is adopted in the output unit 205 to perform classification and recognition on the feature vector to be recognized, so as to realize more precise classification on the trigger word and the role word in the event, thereby improving the precision of event extraction.
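A hedged PyTorch sketch of this classification path follows; the split-at-candidate pooling, the dimensions, and the single linear classifier are illustrative assumptions.

```python
# Dynamic max pooling splits the token hidden states around a candidate
# position and max-pools each segment; the pooled trigger and role
# vectors are spliced with the combined semantic+graph feature vector
# and scored by a linear event classifier.
import torch
import torch.nn as nn

def dynamic_max_pool(h: torch.Tensor, pos: int) -> torch.Tensor:
    """h: (seq_len, d). Max-pool the segments up to and from `pos`."""
    left = h[: pos + 1].max(dim=0).values
    right = h[pos:].max(dim=0).values
    return torch.cat([left, right])            # (2d,)

seq_len, d, num_event_types = 32, 768, 10
h = torch.randn(seq_len, d)                    # semantic embedded vectors
spliced = torch.randn(2 * d)                   # spliced semantic+graph vector
trig_feat = dynamic_max_pool(h, pos=5)         # trigger candidate at token 5
role_feat = dynamic_max_pool(h, pos=12)        # role candidate at token 12
to_identify = torch.cat([trig_feat, role_feat, spliced])
classifier = nn.Linear(to_identify.numel(), num_event_types)
event_logits = classifier(to_identify)         # one score per event type
```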
Alternatively, on the basis of the embodiment corresponding to fig. 15, in another embodiment of the event extraction device provided in the embodiment of the present application,
the processing unit 203 is specifically configured to perform spectral clustering on the spliced feature vectors to obtain a node cluster map, where the node cluster map includes cluster nodes and cluster edges connecting the cluster nodes;
the determining unit 204 is further configured to determine edge weight values between the clustering nodes according to the distance between the clustering edges;
the processing unit 203 is specifically configured to perform graph cutting processing on the node cluster graph to obtain K cluster subgraphs, where K is an integer greater than 1;
the output unit 205 is further configured to output the target event when the edge weight value of each cluster subgraph meets a preset weight value.
In this embodiment, after the processing unit 203 performs spectral clustering on the obtained spliced feature vectors, the determining unit 204 determines the edge weight values between the cluster nodes, and the processing unit 203 performs graph cutting on the node cluster graph obtained by the spectral clustering to obtain K cluster subgraphs. The output unit 205 then detects the edge weight value of each cluster subgraph; when the edge weight value of a cluster subgraph is detected to meet a preset weight value, it can be understood that the cluster subgraph belongs to the cluster corresponding to that preset weight value, the preset weight values and the clusters having a corresponding relationship. The output unit 205 outputs the clustering result of the trigger word clusters and the clustering result of the role clusters, realizing a finer classification of the trigger words and role words in events and thereby improving the precision of event extraction.
Alternatively, on the basis of the embodiment corresponding to fig. 15, in another embodiment of the event extraction device provided in the embodiment of the present application,
the processing unit 203 is further configured to encode the abstract semantic representation according to a sequence coding model to obtain a semantic embedded vector, where the sequence coding model is used to perform time-sequence coding on nodes connected by edges.
In this embodiment, the processing unit 203 uses the multi-layer bidirectional Transformer encoder in the RoBERTa model of the sequence coding model to obtain the semantic representation matrix of hidden vectors, that is, a semantic embedded vector that has a time sequence and can represent the semantic features between words and events, so that the semantic features that are more important in event extraction can subsequently be captured through the semantic embedded vector, thereby improving the accuracy of event extraction.
Alternatively, on the basis of the embodiment corresponding to fig. 15, in another embodiment of the event extraction device provided in the embodiment of the present application,
the obtaining unit 201 is further configured to obtain a corpus to be processed in the database, where the corpus to be processed includes M sentences, and M is an integer greater than or equal to 1;
the generating unit 202 is further configured to generate an abstract semantic representation set from the corpus to be processed, where the abstract semantic representation set includes a set of nodes corresponding to the sentence and an edge set formed by edges connecting the nodes.
In this embodiment, a batch of linguistic data to be processed in the database is obtained through the obtaining unit 201, and in order to enable the model to better learn semantic features and structural features between words and events, the linguistic data to be processed may be preprocessed through the generating unit 202 before the model is pre-trained, specifically, the linguistic data to be processed may be subjected to abstract semantic representation to obtain an abstract semantic representation set, so that a training sample for model training is subsequently constructed, the learning capability of the model is improved, and the accuracy of event extraction is improved.
Alternatively, on the basis of the embodiment corresponding to fig. 15, in another embodiment of the event extraction device provided in the embodiment of the present application,
the generating unit 202 is further configured to generate a contrastive learning training data set according to the abstract semantic representation set.
And the training unit 206 is configured to pre-train the basic model according to the contrastive learning training data set to obtain a training model.
In this embodiment, the generating unit 202 constructs a contrastive learning training data set from the obtained abstract semantic representation set by using the contrastive learning method, where the contrastive learning training data set includes positive examples capable of approximately representing a close relationship and negative examples capable of approximately representing a distant relationship. Then, in order to enable the basic model to better learn the semantic features and structural features between words and events, the training unit 206 pre-trains the basic model with the contrastive learning training data set to improve the learning capability of the basic model and obtain the trained training model, thereby improving the accuracy of event extraction.
Alternatively, on the basis of the embodiment corresponding to fig. 15, in another embodiment of the event extraction device provided in the embodiment of the present application,
the constructing unit 207 is configured to construct a semantic positive example according to the edge, the node, the edge set, and the node set, where the semantic positive example is used to represent a semantic affinity relationship between the trigger word and the role;
the constructing unit 207 is further configured to perform a node replacement operation on the semantic positive example to obtain a semantic negative example, where the semantic negative example is used to represent a semantically distant relationship between a trigger word and a role;
the constructing unit 207 is further configured to use the semantic positive example and the semantic negative example as the contrastive learning semantic training data set.
In this embodiment, the constructing unit 207 adopts the contrastive learning method to construct semantic positive examples and semantic negative examples, which are used as the contrastive learning semantic training data set, so that the model can subsequently be trained with the contrastive learning semantic training data set and can learn more semantic relationships between trigger words and roles from it, thereby improving the learning capability of the model, enhancing the accuracy of model prediction, and improving the accuracy of event acquisition.
Alternatively, on the basis of the embodiment corresponding to fig. 15, in another embodiment of the event extraction device provided in the embodiment of the present application,
the training unit 206 is specifically configured to perform reverse training on the basic sequence coding model according to the contrastive learning semantic training data set and the loss function, so as to obtain the sequence coding model.
In this embodiment, the training unit 206 pre-trains the basic sequence coding model with the acquired contrastive learning semantic training data set and then reversely trains it with the loss function in combination with the acquired semantic representation matrix, so as to obtain the optimized sequence coding model, which improves the ability of the sequence coding model to learn the semantic features between words and events and improves the accuracy of event extraction.
Alternatively, on the basis of the embodiment corresponding to fig. 15, in another embodiment of the event extraction device provided in the embodiment of the present application,
the constructing unit 207 is further configured to perform random graph sampling twice on each abstract semantic representation in the abstract semantic representation set to obtain 2M random sampling subgraphs, where each random sampling subgraph includes a target node and a sub-node, and each abstract semantic representation corresponds to two random sampling subgraphs;
the constructing unit 207 is further configured to take the 2M randomly sampled subgraphs as a graph positive example, where the graph positive example is used to represent semantic affinity of the sub-nodes with respect to the target node;
the constructing unit 207 is further configured to perform random combination processing on the randomly sampled subgraphs to obtain X randomly combined subgraphs, where each randomly combined subgraph includes two randomly sampled subgraphs, the two randomly sampled subgraphs in the randomly combined subgraphs respectively correspond to different sampling semantic representations, and X is an integer not equal to 2M;
the constructing unit 207 is further configured to use the X random combination subgraphs as a negative graph example, where the negative graph example is used to represent a semantic distancing relationship of the sub-node with respect to the target node;
the constructing unit 207 is further configured to use the graph positive examples and the graph negative examples as the contrastive learning graph training data set.
In this embodiment, the graph positive examples and graph negative examples are constructed by the constructing unit 207 through the contrastive learning method and are used as the contrastive learning graph training data set, so that the basic model can subsequently be trained with the contrastive learning graph training data set and can learn more structural relationships between trigger words and roles from it, thereby improving the learning capability of the basic model, enhancing the accuracy of its predictions, and improving the accuracy of event acquisition.
Alternatively, on the basis of the embodiment corresponding to fig. 15, in another embodiment of the event extraction device provided in the embodiment of the present application,
the determining unit 204 is further configured to use any node in the abstract semantic representation as a target node, and use nodes other than the target node as child nodes;
the processing unit 203 is further configured to perform random walking from the target node to the child nodes to obtain a random node set, where the random node set includes the target node, the child nodes, and edges formed from the target node to the child nodes;
the processing unit 203 is further configured to draw a node subgraph according to the random node set;
the processing unit 203 is further configured to number the target node and the sub-node in the node subgraph to obtain a randomly sampled subgraph.
In this embodiment, the determining unit 204 determines the target node for the random walk in the abstract semantic representation and the child nodes corresponding to the target node, and the processing unit 203 then sequentially performs the random walk with restart, subgraph induction, and anonymization on the obtained target node and child nodes to obtain the random sampling subgraph, which can be used to prevent the basic model from finding trivial shortcut solutions when it is pre-trained with the contrastive learning graph training data set and to improve the ability of the basic model to learn the structural features in graph structures, thereby improving the accuracy of event extraction.
Alternatively, on the basis of the embodiment corresponding to fig. 15, in another embodiment of the event extraction device provided in the embodiment of the present application,
the training unit 206 is specifically configured to perform reverse training on the basic graph coding model according to the contrastive learning graph training data set and the loss function, so as to obtain the graph coding model.
In this embodiment, the training unit 206 pre-trains the basic graph coding model with the acquired contrastive learning graph training data set and then reversely trains it with the loss function, so as to obtain the optimized graph coding model, which improves the ability of the graph coding model to learn the structural features between words and events and improves the accuracy of event extraction.
In another aspect, a schematic diagram of a computer device is provided. As shown in fig. 16, for convenience of description, only the parts related to the embodiment of the present invention are shown; for specific technical details that are not disclosed, refer to the method part of the embodiments. The terminal device may be any terminal device including a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a Point of Sales (POS) terminal, a vehicle-mounted computer, and the like. Taking the terminal device being a mobile phone as an example:
Fig. 16 is a block diagram illustrating a partial structure of a mobile phone serving as the terminal device according to an embodiment of the present invention. Referring to fig. 16, the mobile phone includes: Radio Frequency (RF) circuit 310, memory 320, input unit 330, display unit 340, sensor 350, audio circuit 360, wireless fidelity (WiFi) module 370, processor 380, and power supply 390. Those skilled in the art will appreciate that the handset configuration shown in fig. 16 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or the components may be arranged differently.
The following describes each component of the mobile phone in detail with reference to fig. 16:
The RF circuit 310 may be used for receiving and transmitting signals during information transmission and reception or during a call; in particular, it receives downlink information of a base station and forwards it to the processor 380 for processing, and it transmits uplink data to the base station. In general, the RF circuit 310 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 310 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.
The memory 320 may be used to store software programs and modules, and the processor 380 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 320. The memory 320 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like, and the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the mobile phone. Further, the memory 320 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The input unit 330 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the cellular phone. Specifically, the input unit 330 may include a touch panel 331 and other input devices 332. The touch panel 331, also referred to as a touch screen, can collect touch operations of a user (e.g., operations of the user on the touch panel 331 or near the touch panel 331 using any suitable object or accessory such as a finger, a stylus, etc.) on or near the touch panel 331, and drive the corresponding connection device according to a preset program. Alternatively, the touch panel 331 may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 380, and can receive and execute commands sent by the processor 380. In addition, the touch panel 331 may be implemented in various types, such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The input unit 330 may include other input devices 332 in addition to the touch panel 331. In particular, other input devices 332 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 340 may be used to display information input by the user or information provided to the user and various menus of the mobile phone. The Display unit 340 may include a Display panel 341, and optionally, the Display panel 341 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 331 can cover the display panel 341, and when the touch panel 331 detects a touch operation on or near the touch panel 331, the touch panel is transmitted to the processor 380 to determine the type of the touch event, and then the processor 380 provides a corresponding visual output on the display panel 341 according to the type of the touch event. Although in fig. 16, the touch panel 331 and the display panel 341 are two independent components to implement the input and output functions of the mobile phone, in some embodiments, the touch panel 331 and the display panel 341 may be integrated to implement the input and output functions of the mobile phone.
The handset may also include at least one sensor 350, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor that adjusts the brightness of the display panel 341 according to the brightness of ambient light, and a proximity sensor that turns off the display panel 341 and/or the backlight when the mobile phone is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.
The audio circuit 360, speaker 361, and microphone 362 may provide an audio interface between the user and the mobile phone. The audio circuit 360 may transmit the electrical signal converted from the received audio data to the speaker 361, where it is converted into a sound signal and output; on the other hand, the microphone 362 converts the collected sound signals into electrical signals, which are received by the audio circuit 360 and converted into audio data; the audio data is then processed by the processor 380 and transmitted, for example, to another mobile phone via the RF circuit 310, or output to the memory 320 for further processing.
WiFi belongs to short-distance wireless transmission technology, and the mobile phone can help a user to receive and send e-mails, browse webpages, access streaming media and the like through the WiFi module 370, and provides wireless broadband internet access for the user. Although fig. 16 shows the WiFi module 370, it is understood that it does not belong to the essential constitution of the handset, and may be omitted entirely as needed within the scope not changing the essence of the invention.
The processor 380 is the control center of the mobile phone. It connects the various parts of the phone through various interfaces and lines, and performs the phone's functions and processes its data by running or executing the software programs and/or modules stored in the memory 320 and calling the data stored in the memory 320, thereby monitoring the phone as a whole. Optionally, the processor 380 may include one or more processing units; optionally, the processor 380 may integrate an application processor, which mainly handles the operating system, user interface, and application programs, and a modem processor, which mainly handles wireless communication. It will be appreciated that the modem processor need not be integrated into the processor 380.
The handset also includes a power supply 390 (e.g., a battery) for powering the various components. Optionally, the power supply may be logically connected to the processor 380 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
Although not shown, the mobile phone may further include a camera module, a Bluetooth module, and the like, which are not described here.
In the embodiment of the present invention, the processor 380 included in the terminal device is configured to execute the steps in the embodiments corresponding to fig. 2 to fig. 10.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Another aspect of the present application provides a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to perform the steps in the method as described in the embodiments shown in fig. 2 to 10.
Another aspect of the application provides a computer program product comprising instructions which, when run on a computer or processor, cause the computer or processor to perform the steps of the method as described in the embodiments shown in fig. 2 to 10.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only one logical division, and other divisions are possible in practice; multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.

Claims (15)

1. An event extraction method, comprising:
acquiring a text to be processed, wherein the text to be processed comprises N words, and N is an integer greater than 1;
generating an abstract semantic representation according to the text to be processed, wherein the abstract semantic representation comprises nodes in one-to-one correspondence with the words and edges connecting the nodes;
performing semantic coding processing on the abstract semantic representation and the text to be processed to obtain a semantic embedded vector, wherein the semantic embedded vector is used for representing semantic features between each word and the event;
carrying out graph coding processing on the abstract semantic representation to obtain a graph embedding vector, wherein the graph embedding vector is used for representing structural features between the nodes connected through the edges;
splicing the semantic embedded vector and the graph embedded vector to obtain a spliced feature vector;
and identifying the spliced feature vector, and outputting a target event, wherein the target event comprises a trigger word and a role word extracted from the N words, the trigger word is used for indicating the event occurring in the text to be processed, and the role word is used for indicating the role of each entity in the text to be processed in the event.
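By way of illustration only, the following minimal sketch shows one way the pipeline of claim 1 could be wired together. The encoders are deliberate stand-ins (an LSTM for the semantic encoding, a single propagation step for the graph encoding), and every class name, dimension, and tensor shape is an assumption of this sketch rather than part of the claim.

```python
import torch
import torch.nn as nn

class EventExtractorSketch(nn.Module):
    def __init__(self, vocab_size, dim, num_labels):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)                # word features
        self.semantic_enc = nn.LSTM(dim, dim, batch_first=True)  # stand-in semantic encoder
        self.graph_proj = nn.Linear(dim, dim)                     # stand-in graph encoder
        self.classifier = nn.Linear(2 * dim, num_labels)

    def forward(self, token_ids, adjacency):
        h = self.embed(token_ids)                          # (batch, N, dim)
        semantic, _ = self.semantic_enc(h)                 # semantic embedded vectors
        graph = torch.relu(adjacency @ self.graph_proj(h)) # structural features along edges
        fused = torch.cat([semantic, graph], dim=-1)       # spliced feature vector
        return self.classifier(fused)                      # per-word trigger/role logits

tokens = torch.randint(0, 100, (1, 6))          # a 6-word text to be processed
adj = torch.eye(6).unsqueeze(0)                 # trivial AMR adjacency
print(EventExtractorSketch(100, 32, 8)(tokens, adj).shape)  # torch.Size([1, 6, 8])
```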
2. The method of claim 1, wherein prior to the performing graph coding processing on the abstract semantic representation to obtain a graph embedding vector, the method further comprises:
carrying out node coding processing on the text to be processed to obtain a node coding vector, wherein the node coding vector is used for initializing semantic features between each word and the event;
the performing graph coding processing on the abstract semantic representation to obtain a graph embedding vector comprises:
and carrying out graph coding processing on the node coding vector and the abstract semantic representation through a graph coding model to obtain the graph embedding vector.
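A minimal sketch of the graph coding of claim 2, assuming a GCN-style propagation that starts from the node coding vectors; the degree normalisation and the number of propagation steps are illustrative assumptions, not the claimed graph coding model.

```python
import numpy as np

def graph_encode(node_codes, adjacency, weight, steps=2):
    # GCN-style propagation: initialise with the node coding vectors,
    # then mix features along the AMR edges.
    a_hat = adjacency + np.eye(adjacency.shape[0])       # add self-loops
    d_inv = np.diag(1.0 / a_hat.sum(axis=1))             # row-normalise by degree
    h = node_codes
    for _ in range(steps):
        h = np.maximum(d_inv @ a_hat @ h @ weight, 0.0)  # ReLU(D^-1 A H W)
    return h                                             # graph embedding vectors

nodes = np.random.randn(5, 16)                 # node coding vectors for 5 AMR nodes
adj = np.zeros((5, 5))
adj[[0, 1, 1, 2], [1, 0, 2, 1]] = 1.0          # a small chain of edges
w = 0.1 * np.random.randn(16, 16)
print(graph_encode(nodes, adj, w).shape)       # (5, 16)
```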
3. The method of claim 1, wherein the identifying the spliced feature vector and outputting a target event comprises:
determining a trigger word maximum pooling feature vector and a role word maximum pooling feature vector corresponding to the semantic embedded vector according to a dynamic maximum pooling algorithm;
splicing the maximum pooling feature vector of the trigger word and the maximum pooling feature vector of the role word with the spliced feature vector to obtain a feature vector to be identified;
and carrying out classification and identification on the characteristic vectors to be identified to obtain the target event.
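The claim does not fix the pooling boundaries, but dynamic maximum pooling is commonly read as segment-wise pooling around the candidate trigger and role positions (as in DMCNN). A sketch under that assumption:

```python
import numpy as np

def dynamic_max_pool(seq_features, trigger_pos, role_pos):
    # Split the per-word features at the candidate trigger and role
    # positions, max-pool each segment, and concatenate the results.
    cuts = sorted({0, trigger_pos + 1, role_pos + 1, len(seq_features)})
    segments = [seq_features[a:b] for a, b in zip(cuts, cuts[1:]) if b > a]
    return np.concatenate([seg.max(axis=0) for seg in segments])

feats = np.random.randn(10, 8)     # semantic embedded vectors, N = 10 words
pooled = dynamic_max_pool(feats, trigger_pos=3, role_pos=7)
print(pooled.shape)                # (24,) = 3 segments x 8 dims
```

Per claim 3, the pooled trigger and role vectors would then be spliced with the spliced feature vector before classification.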
4. The method of claim 1, wherein the identifying the spliced feature vector and outputting a target event further comprises:
performing spectral clustering on the spliced feature vectors to obtain a node clustering graph, wherein the node clustering graph comprises clustering nodes and clustering edges connected with the clustering nodes;
determining edge weight values between the clustering nodes according to the distance of the clustering edges;
carrying out graph cutting processing on the node clustering graph to obtain K clustering subgraphs, wherein K is an integer greater than 1;
and when the edge weight value of each clustering subgraph accords with a preset weight value, outputting the target event.
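A sketch of the clustering-and-cut step of claim 4, using scikit-learn's SpectralClustering with a precomputed affinity matrix; the similarity measure and the threshold test are illustrative assumptions standing in for the claimed edge weight values and preset weight value.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_and_cut(features, k, weight_threshold):
    sim = features @ features.T
    sim = np.exp(sim - sim.max())                 # positive, symmetric affinities
    labels = SpectralClustering(n_clusters=k, affinity="precomputed",
                                random_state=0).fit_predict(sim)
    kept = []
    for c in range(k):                            # graph cut into K subgraphs
        idx = np.where(labels == c)[0]
        if sim[np.ix_(idx, idx)].mean() >= weight_threshold:
            kept.append(idx.tolist())             # subgraph passes the weight test
    return kept

feats = np.random.randn(12, 6)                    # spliced feature vectors
print(cluster_and_cut(feats, k=3, weight_threshold=0.0))
```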
5. The method of claim 1, wherein the performing semantic coding processing on the abstract semantic representation to obtain a semantic embedded vector comprises:
and coding the abstract semantic representation according to a sequence coding model to obtain the semantic embedded vector, wherein the sequence coding model is used for carrying out time-sequence coding on the nodes connected through the edges.
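Read as a time-sequential encoder, the sequence coding model of claim 5 could be as simple as an LSTM run over the AMR nodes in a linearised (e.g. depth-first) order; the traversal order and the dimensions below are assumptions of this sketch.

```python
import torch
import torch.nn as nn

node_features = torch.randn(1, 7, 32)    # 7 AMR nodes in depth-first order
encoder = nn.LSTM(input_size=32, hidden_size=32, batch_first=True)
semantic_embedded, _ = encoder(node_features)   # time-sequential coding of the nodes
print(semantic_embedded.shape)                  # torch.Size([1, 7, 32])
```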
6. The method of claim 1, wherein prior to the generating an abstract semantic representation according to the text to be processed, the method further comprises:
obtaining a corpus to be processed in a database, wherein the corpus to be processed comprises M sentences, and M is an integer greater than or equal to 1;
and generating an abstract semantic representation set from the corpus to be processed, wherein the abstract semantic representation set comprises node sets corresponding to the sentences and edge sets formed by the edges connecting the nodes.
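A sketch of extracting the node set and edge set from one abstract semantic representation, assuming the `penman` library for AMR graphs; the upstream text-to-AMR parser that would produce the graph string is not shown.

```python
import penman

# An AMR for "The boy wants to go", as an upstream parser might emit it.
amr = "(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))"
graph = penman.decode(amr)

node_set = {var: concept for var, _, concept in graph.instances()}
edge_set = [(src, role, tgt) for src, role, tgt in graph.edges()]
print(node_set)   # {'w': 'want-01', 'b': 'boy', 'g': 'go-02'}
print(edge_set)   # [('w', ':ARG0', 'b'), ('w', ':ARG1', 'g'), ('g', ':ARG0', 'b')]
```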
7. The method according to claim 6, wherein after the generating of the abstract semantic representation set from the corpus to be processed, the method further comprises:
generating a contrastive learning training data set according to the abstract semantic representation set;
and pre-training a basic model according to the contrastive learning training data set to obtain a training model.
8. The method of claim 7, wherein the contrastive learning training data set comprises a contrastive learning semantic training data set, and wherein the generating a contrastive learning training data set according to the abstract semantic representation set comprises:
constructing a semantic positive example according to the edges, the nodes, the edge set, and the node set, wherein the semantic positive example is used for expressing semantic affinity between the trigger word and the role;
performing a node replacement operation on the semantic positive example to obtain a semantic negative example, wherein the semantic negative example is used for expressing a semantically distant relationship between the trigger word and the role;
and taking the semantic positive example and the semantic negative example as the contrastive learning semantic training data set.
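A sketch of the semantic positive/negative construction of claim 8: positives are trigger-role pairs read off an AMR's edges, and negatives replace the role node with a node drawn from a different graph. The dictionary layout of the graphs is an assumption of this illustration.

```python
import random

def make_semantic_examples(graphs):
    positives, negatives = [], []
    for i, g in enumerate(graphs):
        for trigger, role_node in g["edges"]:
            positives.append((trigger, role_node))    # semantically close pair
            other = graphs[(i + 1) % len(graphs)]     # any different graph
            fake = random.choice(sorted(other["nodes"]))
            negatives.append((trigger, fake))         # node replacement -> negative
    return positives, negatives

graphs = [
    {"nodes": {"attack-01", "soldier"}, "edges": [("attack-01", "soldier")]},
    {"nodes": {"buy-01", "car"},        "edges": [("buy-01", "car")]},
]
pos, neg = make_semantic_examples(graphs)
print(pos)   # [('attack-01', 'soldier'), ('buy-01', 'car')]
print(neg)   # e.g. [('attack-01', 'car'), ('buy-01', 'soldier')]
```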
9. The method of claim 8, wherein the training model comprises a sequence coding model, and wherein the pre-training a basic model according to the contrastive learning training data set to obtain a training model comprises:
and performing back-propagation training on a basic sequence coding model according to the contrastive learning semantic training data set and a loss function to obtain the sequence coding model.
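The claims name only "a loss function"; InfoNCE is one common choice for this kind of contrastive pre-training and is shown here purely as an assumption, not as the claimed formula.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, temperature=0.1):
    pos = F.cosine_similarity(anchor, positive, dim=-1) / temperature
    neg = F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=-1) / temperature
    logits = torch.cat([pos.view(1), neg])
    return -F.log_softmax(logits, dim=0)[0]   # pull the positive in, push negatives away

anchor, positive = torch.randn(64), torch.randn(64)
negatives = torch.randn(5, 64)
print(info_nce(anchor, positive, negatives).item())
```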
10. The method of claim 6, wherein the contrastive learning training data set further comprises a contrastive learning graph training data set, and wherein the generating a contrastive learning training data set according to the abstract semantic representation set comprises:
performing random graph sampling twice on each abstract semantic representation in the abstract semantic representation set to obtain 2M random sampling subgraphs, wherein each random sampling subgraph comprises a target node and sub-nodes, and each abstract semantic representation corresponds to two random sampling subgraphs;
taking the 2M random sampling subgraphs as graph positive examples, wherein the graph positive examples are used for representing semantic affinity of the sub-nodes relative to the target node;
performing random combination processing on the random sampling subgraphs to obtain X random combination subgraphs, wherein each random combination subgraph comprises two random sampling subgraphs corresponding to different abstract semantic representations, and X is an integer not equal to 2M;
taking the X random combination subgraphs as graph negative examples, wherein the graph negative examples are used for representing a semantically distant relationship of the sub-nodes relative to the target node;
and taking the graph positive examples and the graph negative examples as the contrastive learning graph training data set.
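A sketch of the graph positive/negative pairing of claim 10: the two random sampling subgraphs of the same abstract semantic representation form a positive example, while subgraphs randomly combined across different representations form negatives. The pair layout is an assumption of this illustration.

```python
import random

def build_graph_pairs(subgraph_pairs):
    positives = list(subgraph_pairs)                        # two views of the same AMR
    negatives = []
    m = len(subgraph_pairs)
    for i in range(m):
        j = random.choice([x for x in range(m) if x != i])  # a different AMR
        negatives.append((subgraph_pairs[i][0], subgraph_pairs[j][1]))
    return positives, negatives

pairs = [("amr0_view_a", "amr0_view_b"), ("amr1_view_a", "amr1_view_b")]
pos, neg = build_graph_pairs(pairs)
print(pos)   # same-representation pairs (graph positive examples)
print(neg)   # cross-representation combinations (graph negative examples)
```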
11. The method of claim 10, wherein the random graph sampling comprises:
taking any node in the abstract semantic representation as the target node, and taking nodes except the target node as the sub-nodes;
randomly walking from the target node to the sub-nodes to obtain a random node set, wherein the random node set comprises the target node, the sub-nodes and edges formed by walking from the target node to the sub-nodes;
drawing a node subgraph according to the random node set;
and numbering the target nodes and the sub-nodes in the node subgraph to obtain the random sampling subgraph.
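A sketch of the random graph sampling of claim 11, with the AMR held as an adjacency dictionary (an assumption of this illustration): walk randomly from a target node, record the traversed edges, then renumber the visited nodes into the subgraph.

```python
import random

def random_walk_subgraph(adjacency, walk_length=4):
    target = random.choice(list(adjacency))           # pick the target node
    visited, edges, current = [target], [], target
    for _ in range(walk_length):                      # random walk to sub-nodes
        neighbours = adjacency[current]
        if not neighbours:
            break
        nxt = random.choice(neighbours)
        edges.append((current, nxt))
        if nxt not in visited:
            visited.append(nxt)
        current = nxt
    renumber = {node: i for i, node in enumerate(visited)}   # numbering step
    return renumber, [(renumber[a], renumber[b]) for a, b in edges]

amr = {"want-01": ["boy", "go-02"], "boy": ["want-01"], "go-02": ["want-01", "boy"]}
print(random_walk_subgraph(amr))
```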
12. The method of claim 10, wherein the training model comprises a graph coding model, and wherein the pre-training a basic model according to the contrastive learning training data set to obtain a training model comprises:
and performing back-propagation training on a basic graph coding model according to the contrastive learning graph training data set and a loss function to obtain the graph coding model.
13. An event extraction device, comprising:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a text to be processed, the text to be processed comprises N words, and N is an integer greater than 1;
the generating unit is used for generating an abstract semantic representation according to the text to be processed, wherein the abstract semantic representation comprises nodes in one-to-one correspondence with the words and edges connecting the nodes;
the processing unit is used for performing semantic coding processing on the abstract semantic representation and the text to be processed to obtain a semantic embedded vector, wherein the semantic embedded vector is used for representing semantic features between each word and the event;
the processing unit is further configured to perform graph coding processing on the abstract semantic representation to obtain a graph embedding vector, where the graph embedding vector is used to represent structural features between the nodes connected by the edge;
the processing unit is further configured to splice the semantic embedded vector and the graph embedded vector to obtain a spliced feature vector;
and the identification unit is used for identifying the spliced feature vector and outputting a target event, wherein the target event comprises trigger words and role words extracted from the N words, the trigger words are used for indicating events occurring in the text to be processed, and the role words are used for indicating roles of all entities in the text to be processed in the events.
14. A computer device, comprising: a memory, a transceiver, a processor, and a bus system;
wherein the memory is used for storing programs;
the processor is configured to execute the program in the memory to implement the method of any one of claims 1 to 12;
the bus system is used for connecting the memory and the processor so as to enable the memory and the processor to communicate.
15. A computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the method of any of claims 1 to 12.
CN202110546916.7A 2021-05-19 2021-05-19 Event extraction method, related device, equipment and storage medium Pending CN113761122A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110546916.7A CN113761122A (en) 2021-05-19 2021-05-19 Event extraction method, related device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113761122A true CN113761122A (en) 2021-12-07

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114528444A (en) * 2022-02-25 2022-05-24 北京百度网讯科技有限公司 Graph data processing method and device, electronic equipment and storage medium
CN114528444B (en) * 2022-02-25 2023-02-03 北京百度网讯科技有限公司 Graph data processing method and device, electronic equipment and storage medium
WO2024074099A1 (en) * 2022-10-04 2024-04-11 阿里巴巴达摩院(杭州)科技有限公司 Model training method and apparatus, text processing method and apparatus, device, and storage medium
CN116992854A (en) * 2023-04-25 2023-11-03 云南大学 Text abstract generation method based on AMR (automatic dependent memory) contrast learning
CN116992854B (en) * 2023-04-25 2024-07-23 云南大学 Text abstract generation method based on AMR (automatic dependent memory) contrast learning
CN116628210A (en) * 2023-07-24 2023-08-22 广东美的暖通设备有限公司 Fault determination method for intelligent building fault event extraction based on comparison learning
CN116628210B (en) * 2023-07-24 2024-03-19 广东美的暖通设备有限公司 Fault determination method for intelligent building fault event extraction based on comparison learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination