CN113887213A - Event detection method and device based on multilayer graph attention network - Google Patents


Info

Publication number
CN113887213A
Authority
CN
China
Prior art keywords
information
context
vector
word
syntactic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111164755.1A
Other languages
Chinese (zh)
Inventor
包先雨
吴共庆
何俐娟
柯培超
陆振亚
王歆
程立勋
蔡伊娜
郑文丽
慕容灏鼎
蔡屹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Shenzhen Academy of Inspection and Quarantine
Original Assignee
Hefei University of Technology
Shenzhen Academy of Inspection and Quarantine
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology, Shenzhen Academy of Inspection and Quarantine filed Critical Hefei University of Technology
Priority to CN202111164755.1A priority Critical patent/CN113887213A/en
Priority to PCT/CN2021/123249 priority patent/WO2023050470A1/en
Publication of CN113887213A publication Critical patent/CN113887213A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The application provides an event detection method and device based on a multi-layer graph attention network. The method comprises: obtaining context words in event text information, and determining a syntactic information adjacency matrix and a splicing vector corresponding to the context words; taking the adjacency matrix and the splicing vector as the input of an artificial neural network to obtain an output vector; aggregating the splicing vector and the output vector to generate aggregation information; and determining the trigger word category of the context words according to the aggregation information. By combining the syntactic information and the context information of the context words, the method effectively mitigates the information loss and error propagation that syntactic analysis tools are prone to; and by adding a skip-connection module to each graph attention layer, it avoids the excessive propagation of short-range syntactic information that would otherwise degrade the final trigger word classification, effectively improving the precision, recall and F1 value of trigger word classification.

Description

Event detection method and device based on multilayer graph attention network
Technical Field
The present application relates to the field of natural language processing, and in particular, to an event detection method and apparatus based on a multi-layer graph attention network.
Background
A Knowledge Graph describes the concepts, entities and relations of the objective world in a structured form, expresses internet information in a form closer to human cognition, and provides the ability to better organize, manage and understand the massive information on the internet. The knowledge graph was proposed by Google in 2012 and successfully applied to its search engine; it belongs to knowledge engineering, an important research field of artificial intelligence, and is a flagship application of knowledge engineering for building large-scale knowledge resources. Typical examples include the knowledge graph launched by Google in 2012 after acquiring Freebase (a free knowledge database), the Graph Search of Facebook (a social network service), Microsoft Satori, and domain-specific knowledge bases in commerce, finance, life sciences and other fields.
The event knowledge in a knowledge graph is hidden in internet resources, including existing structured semantic knowledge, structured database information, semi-structured information resources and unstructured resources; resources of different natures require different knowledge acquisition methods. Event identification and extraction studies how event information can be identified and extracted from text describing it and presented in a structured form, including the time, place and participating roles of the occurrence and the associated changes of action or state.
Traditional event detection methods ignore the syntactic features between the words of a sentence and use only sentence-level features, so word ambiguity easily leads to low recognition efficiency and low classification precision for trigger words. In recent years, however, methods that use syntactic information to improve event detection have proven effective. For example, the paper "Trigger-word-free event detection method fusing syntactic information" proposes using syntactic information combined with an attention mechanism (ATTENTION) to connect event information scattered across a sentence and improve the accuracy of event detection; the paper "Vietnamese news event detection integrating dependency information and a convolutional neural network" uses convolution to encode the features between non-adjacent words linked by dependency syntax and then fuses the two kinds of features as the event encoding, thereby realizing event detection.
Disclosure of Invention
In view of the above, the present application is proposed to provide an event detection method based on a multi-layer graph attention network that overcomes, or at least partially solves, the above problems.
an event detection method based on a multilayer graph attention network comprises the following steps:
obtaining context words in event text information, and determining a syntactic information adjacency matrix and a splicing vector corresponding to the context words;
taking the adjacency matrix and the splicing vector as the input of an artificial neural network to obtain an output vector;
aggregating the splicing vector and the output vector to generate aggregation information;
and determining the trigger word category of the context word according to the aggregation information.
Further, the step of obtaining a context word in the event text information and determining a syntactic information adjacency matrix and a concatenation vector corresponding to the context word includes:
determining syntactic information corresponding to the context words according to the context words;
generating the syntax information adjacency matrix according to the syntax information;
and generating the splicing vector according to the word embedding vector of the context word.
Further, the step of determining syntactic information corresponding to the context word according to the context word includes:
and analyzing the event text information through syntactic dependency, and generating syntactic information corresponding to the context word according to the analysis result of the event text information.
Further, the step of obtaining an output vector by using the adjacency matrix and the splicing vector as input of the artificial neural network includes:
combining the adjacency matrices of the same batch into a tensor;
and inputting the tensor and the splicing vector into an artificial neural network for calculation, and generating the output vector according to the calculation result of the artificial neural network.
Further, the step of determining the trigger word category of the context word according to the aggregation information includes:
determining a trigger word of the context word according to the aggregation information, and classifying the trigger word according to a classifier module.
An event detection apparatus based on a multi-layer graph attention network, comprising:
an acquisition module, used for acquiring context words in event text information and determining a syntactic information adjacency matrix and a splicing vector corresponding to the context words;
the computing module is used for taking the adjacency matrix and the splicing vector as the input of an artificial neural network to obtain an output vector;
the aggregation module is used for aggregating the splicing vector and the output vector to generate aggregation information;
and the classification module is used for determining the trigger word category of the context word according to the aggregation information.
Further, the obtaining module includes:
the expression submodule is used for determining syntactic information corresponding to the context words according to the context words;
a generating submodule, configured to generate the syntax information adjacency matrix according to the syntax information;
and the splicing submodule is used for generating the splicing vector according to the word embedding vector of the context word.
Further, the expression submodule comprises:
and the dependency analysis submodule is used for analyzing the event text information through syntactic dependency and generating syntactic information corresponding to the context word according to the analysis result of the event text information.
Further, the calculation module includes:
the array conversion submodule is used for combining the adjacency matrices of the same batch into a tensor;
and the artificial neural network calculation submodule is used for inputting the tensor and the splicing vector into an artificial neural network for calculation and generating the output vector according to the result of the artificial neural network calculation.
Further, the classification module includes:
and the trigger word processing submodule is used for determining the trigger words of the context words according to the aggregation information and classifying the trigger words according to the classifier module.
The application has the following advantages:
in the embodiments of the application, context words in event text information are obtained, and a syntactic information adjacency matrix and a splicing vector corresponding to the context words are determined; the adjacency matrix and the splicing vector are taken as the input of an artificial neural network to obtain an output vector; the splicing vector and the output vector are aggregated to generate aggregation information; and the trigger word category of the context words is determined according to the aggregation information. By combining the syntactic information and the context information of the context words, the method effectively mitigates the information loss and error propagation that syntactic analysis tools are prone to; and by adding a skip-connection module to each graph attention layer, more of the original features are retained, the excessive propagation of short-range syntactic information that would degrade the final trigger word classification is avoided, and the precision, recall and F1 value of trigger word classification are effectively improved.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings needed to be used in the description of the present application will be briefly introduced below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive labor.
FIG. 1 is a flowchart illustrating steps of an event detection method based on a multi-layer graph attention network according to an embodiment of the present application;
FIG. 2 is a diagram of a syntactic dependency tree provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of an adjacency matrix provided by an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating an attention network provided by an embodiment of the present application;
fig. 5 is a flowchart illustrating an event detection method based on a multi-layer graph attention network according to an embodiment of the present application;
fig. 6 is a block diagram of an event detection apparatus based on a multi-layer graph attention network according to an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, an event detection method based on a multi-layer graph attention network according to an embodiment of the present application is shown;
the method comprises the following steps:
s110, obtaining context words in event text information, and determining a syntactic information adjacent matrix and a splicing vector corresponding to the context words;
s120, taking the adjacency matrix and the splicing vector as input of an artificial neural network to obtain an output vector;
s130, generating aggregation information according to the splicing vector and the output vector in an aggregation mode;
s140, determining the trigger word category of the context word according to the aggregation information.
In the embodiments of the application, context words in event text information are obtained, and a syntactic information adjacency matrix and a splicing vector corresponding to the context words are determined; the adjacency matrix and the splicing vector are taken as the input of an artificial neural network to obtain an output vector; the splicing vector and the output vector are aggregated to generate aggregation information; and the trigger word category of the context words is determined according to the aggregation information. By combining the syntactic information and the context information of the context words, the method effectively mitigates the information loss and error propagation that syntactic analysis tools are prone to; and by adding a skip-connection module to each graph attention layer, more of the original features are retained, the excessive propagation of short-range syntactic information that would degrade the final trigger word classification is avoided, and the precision, recall and F1 value of trigger word classification are effectively improved.
Hereinafter, the event detection method based on the multi-layer graph attention network in the present exemplary embodiment will be further described.
As described in step S110, a context word in the event text information is obtained, and a syntactic information adjacency matrix and a concatenation vector corresponding to the context word are determined.
In an embodiment of the present application, a specific process of "obtaining context words in the event text information and determining the syntactic information adjacency matrix and the concatenation vector corresponding to the context words" in step S110 may be further described in conjunction with the following description.
Determining syntactic information corresponding to the context word according to the context word as described in the following steps;
in an embodiment of the present application, the specific process of "determining syntax information corresponding to the context word according to the context word" may be further described in conjunction with the following description.
And analyzing the event text information through syntactic dependency, and generating syntactic information corresponding to the context word according to the analysis result of the event text information.
Syntactic dependency analysis reveals the syntactic structure of a sentence by analyzing the dependency relations between the components of a language unit; it recognizes grammatical components such as subject, predicate and object, attributive, adverbial and complement, and emphasizes the relations between the analyzed words. In syntactic dependency analysis the core of a sentence is the predicate verb; the other components are then found around the predicate, and the sentence is finally analyzed into a syntactic dependency tree. A syntactic dependency tree can describe the dependency relations between words.
In a specific implementation, event text information is acquired and recognized, syntactic dependency analysis is performed with Stanford CoreNLP (the Stanford natural language processing toolkit), each sentence in the event text is analyzed, the event trigger words in the sentence are identified, and the dependency relations between event trigger words and event arguments and/or between event arguments are analyzed with emphasis to form a syntactic dependency tree.
An event trigger word is the word that best indicates the occurrence of an event; it is the projection of the event concept at the word and phrase level, the basis and grounds of event identification, and an important feature for determining the event category, and is usually a verb or a noun. Event arguments are the information describing the time, place and participants of the event.
Referring to FIG. 2, a diagram of a syntactic dependency tree provided by an embodiment of the present application is shown. For the sentence "I go to Beijing Tiananmen to see the rising of the sun", in the constructed syntactic dependency tree the core predicate of the sentence is "go", which is the root of the tree; the subject of "go" is "I", the object of "go" is "Beijing Tiananmen", and the object of the other verb is "sun". The syntactic dependency tree can thus describe the dependency relations between context words.
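For illustration only, the following Python sketch shows how such a dependency tree can be produced programmatically. The embodiment names the Stanford CoreNLP tool; the Stanford NLP group's Python package stanza is used here as a stand-in, so the package choice and the English example output format are assumptions and not part of the described embodiment.

```python
# Minimal sketch of syntactic dependency parsing (stanza used as a stand-in
# for the Stanford CoreNLP tool named in the embodiment).
import stanza

stanza.download('en')  # one-time model download
nlp = stanza.Pipeline(lang='en', processors='tokenize,pos,lemma,depparse')

doc = nlp("I go to Beijing Tiananmen to see the rising of the sun")
for sentence in doc.sentences:
    for word in sentence.words:
        # word.head is the 1-based index of the governing word (0 for the root),
        # word.deprel is the dependency relation label on the arc
        head = sentence.words[word.head - 1].text if word.head > 0 else 'ROOT'
        print(f"{word.text} <-{word.deprel}- {head}")
```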
Generating the syntax information adjacency matrix according to the syntax information as described in the following steps;
the adjacency matrix is a matrix representing the adjacency relationship between vertices. Let G ═ (V, E) be a figure, where V ═ V1,v2,…,vnV is a vertex, E is an edge, a one-dimensional array is used for storing data of all the vertices in the graph, and a two-dimensional array is used for storing data of the relationship (the edge or the arc) between the vertices, and the two-dimensional array is called as an adjacent matrix. The adjacency matrices are further divided into directed graph adjacency matrices and undirected graph adjacency matrices. The adjacency matrix of G is an n-th order square matrix having the following properties: for an undirected graph, the adjacency matrix must be symmetric, and the major diagonal is zero, the minor diagonal is not necessarily 0, which is not necessary for a directed graph. In an undirected graph, the degree of any vertex i is the ith column (orRow i) is the number of all non-zero elements in row i, the out-degree of a vertex i in the directed graph is the number of all non-zero elements in row i, the in-degree is the number of all non-zero elements in column i, and the syntactic dependency relationship between two event parameters is stored by adopting an adjacent matrix of the directed graph.
As an example, a syntactic dependency tree is formed for each sentence by syntactic dependency analysis, and the corresponding adjacency matrix is generated from that tree.
In a specific implementation, referring to FIG. 3, a schematic diagram of an adjacency matrix provided by an embodiment of the present application is shown; the adjacency matrix in FIG. 3 corresponds to the syntactic dependency tree in FIG. 2. In FIG. 2, "Beijing" and "Tiananmen" are parallel objects of the trigger word "go", so the values at the intersections of the row of "go" with the columns of "Beijing" and "Tiananmen" in the corresponding adjacency matrix are 1. Each word is taken as a node; with the seven words "I", "go", "Beijing", "Tiananmen", "see", "sun" and "rising", the matrix is a 7×7 square matrix. If a syntactic arc exists between two words, the corresponding position of the matrix is 1; otherwise it is 0. The adjacency matrix of a directed graph is used to store the syntactic dependency relations of the text: if a dependency relation exists between two words, the corresponding matrix element is 1, and between words without a dependency relation it is 0. The dependencies between the context words can thus be represented by the adjacency matrix.
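For illustration only, the following sketch builds such a directed-graph adjacency matrix from dependency arcs. The (head, dependent) index pairs below are illustrative placeholders, not the exact arcs of FIG. 2.

```python
import numpy as np

# Seven words of the example sentence, one node per word
tokens = ["I", "go", "Beijing", "Tiananmen", "see", "sun", "rising"]
# Hypothetical (head, dependent) index pairs standing in for the syntactic arcs
edges = [(1, 0), (1, 2), (1, 3), (1, 4), (4, 6), (6, 5)]

# Directed-graph adjacency matrix: 1 where a syntactic arc exists, 0 otherwise
adj = np.zeros((len(tokens), len(tokens)), dtype=int)   # 7x7 square matrix
for head, dep in edges:
    adj[head, dep] = 1

print(adj)
```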
The stitching vector is generated from the word-embedded vectors of the context words, as described in the following steps.
It should be noted that the word-level information in a sentence needs to be converted into real-valued vectors as the input of the artificial neural network. Let X = {x1, x2, x3, …, xn} be a sentence of length n, where xi is the i-th word of the sentence. In natural language processing tasks the semantic information of a word is related to its position in the sentence, and part-of-speech and entity-type information help with trigger word recognition and semantic understanding. The present application therefore takes the splicing vector formed by concatenating the word-sense vector, entity vector, part-of-speech vector and position vector of each context word as the input of the artificial neural network.
In a specific implementation, the 4 different word embedding vectors of each context word, namely the word-sense vector, entity vector, part-of-speech vector and position vector, are concatenated into a first splicing vector; the first splicing vector is input into a Bi-LSTM neural network layer to generate a second splicing vector, which serves as one of the input vectors of the multi-layer graph attention network. The splicing vector can thus capture the semantic information among the context words.
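For illustration only, the splicing step can be sketched as follows in PyTorch; the vocabulary sizes, embedding dimensions and module names are assumptions, not values from the embodiment.

```python
import torch
import torch.nn as nn

class SpliceEncoder(nn.Module):
    """Concatenates word-sense, entity, part-of-speech and position embeddings
    (first splicing vector) and feeds them through a Bi-LSTM (second splicing vector)."""
    def __init__(self, n_words, n_entities, n_pos, n_positions,
                 d_word=100, d_feat=25, d_hidden=100):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, d_word)
        self.entity_emb = nn.Embedding(n_entities, d_feat)
        self.pos_emb = nn.Embedding(n_pos, d_feat)
        self.position_emb = nn.Embedding(n_positions, d_feat)
        self.bilstm = nn.LSTM(d_word + 3 * d_feat, d_hidden,
                              batch_first=True, bidirectional=True)

    def forward(self, word_ids, entity_ids, pos_ids, position_ids):
        # First splicing vector: concatenation of the four embeddings
        first = torch.cat([self.word_emb(word_ids), self.entity_emb(entity_ids),
                           self.pos_emb(pos_ids), self.position_emb(position_ids)], dim=-1)
        # Second splicing vector: Bi-LSTM output, shape [batch, seq_len, 2 * d_hidden]
        second, _ = self.bilstm(first)
        return second
```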
And step S120, taking the adjacency matrix and the splicing vector as input of an artificial neural network, and obtaining an output vector.
It should be noted that the artificial neural network is a multi-layer Graph Attention Network. Conventional graph convolutional networks have various limitations: they cannot handle directed graphs well, are not suited to inductive tasks (tasks in which the graph structures to be processed in the training stage and the test stage differ), and cannot handle dynamic graphs. With a graph attention network, even if the structure of the graph changes during prediction, the influence is small; the parameters only need to be adjusted and the computation carried out again. The graph attention network operates vertex by vertex, each operation looping over all vertices of the graph. Vertex-by-vertex operation removes the constraint of the Laplacian matrix present in the original graph structure, so the directed-graph problem is easily solved.
In an embodiment of the present application, a specific process of "taking the adjacency matrix and the splicing vector as the input of the artificial neural network to obtain the output vector" in step S120 may be further described with reference to the following description.
Combining the adjacency matrices of the same batch into a tensor, as described in the following steps;
in one specific implementation, the sentences recognized at the same time in the event text information form one batch, and the adjacency matrices of the sentences of the same batch form a tensor. The set of adjacency matrices is denoted T_V, and the resulting tensor is expressed as A ∈ R^(N×N×K), where K = |T_V| and N is the number of nodes.
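For illustration only, forming the batch tensor can be sketched as follows; padding the sentences of a batch to a common length N is an assumption of the sketch.

```python
import torch

def batch_adjacency(adj_list):
    """Stack the K adjacency matrices of one batch (each N x N, sentences padded
    to the same length N) into a tensor A of shape [N, N, K]."""
    mats = [torch.as_tensor(a, dtype=torch.float32) for a in adj_list]
    return torch.stack(mats, dim=-1)
```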
And inputting the tensor and the splicing vector into an artificial neural network for calculation, and generating the output vector according to the calculation result of the artificial neural network.
As an example, referring to FIG. 4, a schematic diagram of the graph attention network provided by an embodiment of the present application is shown; the computation is divided into two steps, calculating the attention coefficients and performing the weighted summation. The tensor and the second splicing vector are used as the input of the graph attention layer, expressed as

h = {h_1, h_2, …, h_N}, with h_i ∈ R^F,

where N is the number of nodes and F is the number of node features; the output is expressed as

h' = {h'_1, h'_2, …, h'_N}, with h'_i ∈ R^(F'),

where F' is the dimension of the new node feature vectors. The attention coefficient between node i and each neighboring node j ∈ N_i is calculated, as shown on the left side of FIG. 4, as

e_ij = a(W h_i, W h_j),

where a is a mapping R^(F') × R^(F') → R and W ∈ R^(F'×F) is a weight matrix.
For each node, the graph attention network uses the attention mechanism to calculate a similarity coefficient between node i and its adjacent nodes j, so that it does not rely entirely on the graph structure.
The attention coefficients are normalized with softmax:

α_ij = softmax_j(e_ij) = exp( LeakyReLU( a^T [W h_i || W h_j] ) ) / Σ_{k∈N_i} exp( LeakyReLU( a^T [W h_i || W h_k] ) ),

where || denotes vector concatenation; e_ij and α_ij are both called attention coefficients, α_ij being the normalization of e_ij.
After the attention coefficients of all nodes are normalized, the features of the neighbor nodes are weighted and summed to generate the output vector:

h'_i = σ( Σ_{j∈N_i} α_ij W h_j ),

where W is the weight matrix multiplied with the features, σ is a nonlinear activation function, and j ∈ N_i ranges over all nodes adjacent to i.
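For illustration only, the two steps above (attention coefficients, then weighted summation) can be sketched as a single graph attention layer in PyTorch; the masking strategy and the choice of ELU as the nonlinearity σ are assumptions of the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Single-head graph attention layer: e_ij = a(Wh_i, Wh_j), softmax over
    neighbors, then weighted summation of the neighbor features."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h, adj):
        # h: [N, F] node features; adj: [N, N], 1 where a syntactic arc exists
        # (self-loops are assumed so every node has at least one neighbor)
        Wh = self.W(h)                                             # [N, F']
        N = Wh.size(0)
        pairs = torch.cat([Wh.unsqueeze(1).expand(N, N, -1),       # builds [Wh_i || Wh_j]
                           Wh.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1))                # e_ij, shape [N, N]
        e = e.masked_fill(adj == 0, float('-inf'))                 # keep only j in N_i
        alpha = torch.softmax(e, dim=-1)                           # normalized alpha_ij
        return F.elu(alpha @ Wh)                                   # sigma(sum_j alpha_ij W h_j)
```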
As shown on the right side of FIG. 4, for a three-layer graph attention network the multi-layer attention mechanism assigns different attention weights to different features. For a multi-layer graph attention network with K attention computations the results are concatenated:

h'_i = ||_{k=1}^{K} σ( Σ_{j∈N_i} α^k_ij W^k h_j );

if the multi-layer graph attention network is applied at the output layer, the results are averaged instead:

h'_i = σ( (1/K) Σ_{k=1}^{K} Σ_{j∈N_i} α^k_ij W^k h_j ).
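For illustration only, a sketch of how K parallel attention computations can be combined, building on the GraphAttentionLayer sketch above: concatenation in hidden layers, averaging at the output layer. The module wiring is an assumption, not the embodiment's exact architecture.

```python
import torch
import torch.nn as nn

class MultiHeadGraphAttention(nn.Module):
    """K parallel GraphAttentionLayer computations; hidden layers concatenate the
    K outputs, the output layer averages them, mirroring the two formulas above."""
    def __init__(self, in_dim, out_dim, num_heads, is_output_layer=False):
        super().__init__()
        self.heads = nn.ModuleList(GraphAttentionLayer(in_dim, out_dim)
                                   for _ in range(num_heads))
        self.is_output_layer = is_output_layer

    def forward(self, h, adj):
        outs = [head(h, adj) for head in self.heads]
        if self.is_output_layer:
            return torch.stack(outs, dim=0).mean(dim=0)   # average at the output layer
        return torch.cat(outs, dim=-1)                    # concatenate in hidden layers
```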
as stated in step S130, aggregate information is generated according to the concatenation vector and the output vector.
As an example, in each graph attention layer the aggregation of syntactic information is implemented by a skip-connection module: the splicing vector skips over the attention layer through the skip-connection module and is aggregated with the output vector. The skip-connection module prevents short-range syntactic information from being propagated excessively, retains more of the original syntactic information, and avoids a poor final trigger word classification.
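For illustration only, the skip connection around one attention layer can be sketched as follows, building on the multi-head sketch above. The residual-style addition is one possible aggregation operation and is an assumption of the sketch.

```python
import torch.nn as nn

class GATLayerWithSkip(nn.Module):
    """Wraps one multi-head graph attention layer with a skip connection: the
    splicing vector bypasses the layer and is aggregated with the layer output."""
    def __init__(self, dim, num_heads):
        super().__init__()
        # concatenating num_heads outputs of width dim // num_heads keeps the width at dim
        # (dim is assumed to be divisible by num_heads)
        self.gat = MultiHeadGraphAttention(dim, dim // num_heads, num_heads)

    def forward(self, h, adj):
        return self.gat(h, adj) + h   # aggregate the layer output with the skipped input
```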
As stated in step S140, the trigger word category of the context word is determined according to the aggregation information.
In an embodiment of the present application, the specific process of "determining the trigger category of the context word according to the aggregation information" in step S140 may be further described in conjunction with the following description.
Determining a trigger word of the context word according to the aggregation information, and classifying the trigger word according to a classifier module.
As an example, the trigger word of the context word is determined according to the aggregation information, the trigger word is classified according to the preset condition of the classifier module, and the event type corresponding to the event sentence is determined according to the classification category of the trigger word. The event types are categories defined in advance.
Specifically, the preset condition of the classifier module is to aggregate the information from the different modules, pass it through a fully connected layer, and then use a softmax function to select, from the category probabilities corresponding to each context word, the category with the largest probability as the predicted label of the current trigger word (the softmax function maps the outputs of multiple neurons into the interval (0, 1), which can be interpreted as probabilities, thereby enabling multi-class classification).
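For illustration only, the classifier step can be sketched as follows; the input width and class count are placeholders.

```python
import torch
import torch.nn as nn

class TriggerClassifier(nn.Module):
    """Fully connected layer followed by softmax; the largest class probability
    gives the predicted trigger-word label for each context word."""
    def __init__(self, in_dim, num_classes):
        super().__init__()
        self.fc = nn.Linear(in_dim, num_classes)

    def forward(self, aggregated):
        logits = self.fc(aggregated)               # [batch, seq_len, num_classes]
        probs = torch.softmax(logits, dim=-1)      # mapped into (0, 1), read as probabilities
        return probs, probs.argmax(dim=-1)         # per-word predicted category
```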
The following experimental demonstration is performed on the event detection method based on the multi-layer graph attention network provided by the embodiment of the application:
the experimental environment is as follows: the system comprises a Pythrch-1.8.0 (an open source Python machine learning library), an Nvidia GeForce RTX 3060 (a display card chip), Windows 10 (a computer operating system), Inter i7-11700k, a memory 16G and a hard disk 1T.
The experimental data are shown in Table 1 (comparative results of the experiments); the table is provided as an image in the original publication.
Experimental results: the experiment uses precision (P), recall (R) and the F1 value (F1-score) as the observed variables. P, R and F1 are defined as follows:

P = TP / (TP + FP)

R = TP / (TP + FN)

F1 = 2 × P × R / (P + R)

where TP, FP and FN are the numbers of correctly detected, falsely detected and missed trigger words, respectively.
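For illustration only, a minimal helper computing the three observed variables from the TP/FP/FN counts follows the standard definitions above.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall and F1 from TP/FP/FN counts."""
    p = tp / (tp + fp) if (tp + fp) else 0.0
    r = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1
```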
to ensure the accuracy of the experiment, the division of the data set is kept consistent with that used by the other event detection methods. The experimental results show that, compared with traditional event detection methods that only use sentence-level features, the F1-score of the event detection method provided by this embodiment is about 8% higher; compared with methods based on graph neural networks, the method provided by this embodiment also achieves the highest F1-score and recall.
Referring to fig. 5, a flow diagram of an event detection method based on a multi-layer graph attention network is shown;
in a specific implementation, after event text information is acquired, analyzing the event text information through a syntactic analysis technology to generate a syntactic dependency tree, generating an adjacent matrix corresponding to the context word according to the syntactic dependency tree, and generating a tensor from the adjacent matrix of sentences in the same batch; splicing the embedded vectors of 4 different words of the context word into a first spliced vector, inputting the first spliced vector into a Bi-LSTM neural network layer to generate a second spliced vector, and inputting the adjacency matrix and the second spliced vector into a multilayer graph attention network to generate an output vector so as to perform aggregation operation on syntactic information of different depths; the splicing vector is subjected to aggregation operation by skipping a multilayer graph attention network through a skip connection module; and aggregating the output vector and the spliced vector, classifying the trigger words of the context words through a classifier module, and determining the event type corresponding to the event sentence.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
Referring to fig. 6, an event detection apparatus based on a multi-layer graph attention network according to an embodiment of the present application is shown;
the method specifically comprises the following steps:
an obtaining module 610, configured to obtain a context word in event text information, and determine a syntactic information adjacency matrix and a concatenation vector corresponding to the context word;
a calculating module 620, configured to use the adjacency matrix and the splicing vector as inputs of an artificial neural network, and obtain an output vector;
an aggregation module 630, configured to aggregate the splicing vector and the output vector to generate aggregation information;
and the classification module 640 is configured to determine a trigger word category of the context word according to the aggregation information.
In an embodiment of the present application, the obtaining module 610 includes:
the expression submodule is used for determining syntactic information corresponding to the context words according to the context words;
a generating submodule, configured to generate the syntax information adjacency matrix according to the syntax information;
and the splicing submodule is used for generating the splicing vector according to the word embedding vector of the context word.
In an embodiment of the present application, the expression submodule includes:
and the dependency analysis submodule is used for analyzing the event text information through syntactic dependency and generating syntactic information corresponding to the context word according to the analysis result of the event text information.
In an embodiment of the present application, the calculating module 620 includes:
the array conversion submodule is used for combining the adjacency matrices of the same batch into a tensor;
and the artificial neural network calculation submodule is used for inputting the tensor and the splicing vector into an artificial neural network for calculation and generating the output vector according to the result of the artificial neural network calculation.
In an embodiment of the present application, the classification module 640 includes:
and the trigger word processing submodule is used for determining the trigger words of the context words according to the aggregation information and classifying the trigger words according to the classifier module.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The event detection method and device based on the multi-layer graph attention network provided by the application are introduced in detail, a specific example is applied in the text to explain the principle and the implementation of the application, and the description of the above embodiment is only used for helping to understand the method and the core idea of the application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. An event detection method based on a multilayer graph attention network is characterized by comprising the following steps:
obtaining context words in event text information, and determining a syntactic information adjacency matrix and a splicing vector corresponding to the context words;
taking the adjacency matrix and the splicing vector as the input of an artificial neural network to obtain an output vector;
aggregating the splicing vector and the output vector to generate aggregation information;
and determining the trigger word category of the context word according to the aggregation information.
2. The method for detecting events based on the multi-layer graph attention network of claim 1, wherein the step of obtaining context words in the event text information and determining the syntactic information adjacency matrix and the concatenation vector corresponding to the context words comprises:
determining syntactic information corresponding to the context words according to the context words;
generating the syntax information adjacency matrix according to the syntax information;
and generating the splicing vector according to the word embedding vector of the context word.
3. The method for detecting events based on multilayer graph attention network of claim 2, wherein the step of determining syntactic information corresponding to the context word according to the context word comprises:
and analyzing the event text information through syntactic dependency, and generating syntactic information corresponding to the context word according to the analysis result of the event text information.
4. The event detection method based on the multilayer graph attention network according to claim 1, wherein the step of taking the adjacency matrix and the splicing vector as input of the artificial neural network to obtain an output vector comprises:
combining the adjacency matrices of the same batch into a tensor;
and inputting the tensor and the splicing vector into an artificial neural network for calculation, and generating the output vector according to the calculation result of the artificial neural network.
5. The method for detecting events based on multi-layer graph attention network of claim 1, wherein the step of determining the trigger word class of the context word according to the aggregation information comprises:
determining a trigger word of the context word according to the aggregation information, and classifying the trigger word according to a classifier module.
6. An event detection device based on a multilayer graph attention network, comprising:
an acquisition module, used for acquiring context words in event text information and determining a syntactic information adjacency matrix and a splicing vector corresponding to the context words;
the computing module is used for taking the adjacency matrix and the splicing vector as the input of an artificial neural network to obtain an output vector;
the aggregation module is used for aggregating the splicing vector and the output vector to generate aggregation information;
and the classification module is used for determining the trigger word category of the context word according to the aggregation information.
7. The event detection device based on the multilayer graph attention network of claim 6, wherein the obtaining module comprises:
the expression submodule is used for determining syntactic information corresponding to the context words according to the context words;
a generating submodule, configured to generate the syntax information adjacency matrix according to the syntax information;
and the splicing submodule is used for generating the splicing vector according to the word embedding vector of the context word.
8. The event detection device based on the multi-layer graph attention network of claim 7, wherein the expression submodule comprises:
and the dependency analysis submodule is used for analyzing the event text information through syntactic dependency and generating syntactic information corresponding to the context word according to the analysis result of the event text information.
9. The event detection device based on the multilayer graph attention network of claim 6, wherein the calculation module comprises:
the array conversion submodule is used for combining the adjacency matrices of the same batch into a tensor;
and the artificial neural network calculation submodule is used for inputting the tensor and the splicing vector into an artificial neural network for calculation and generating the output vector according to the result of the artificial neural network calculation.
10. The event detection device based on the multi-layer graph attention network of claim 6, wherein the classification module comprises:
and the trigger word processing submodule is used for determining the trigger words of the context words according to the aggregation information and classifying the trigger words according to the classifier module.
CN202111164755.1A 2021-09-30 2021-09-30 Event detection method and device based on multilayer graph attention network Pending CN113887213A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111164755.1A CN113887213A (en) 2021-09-30 2021-09-30 Event detection method and device based on multilayer graph attention network
PCT/CN2021/123249 WO2023050470A1 (en) 2021-09-30 2021-10-12 Event detection method and apparatus based on multi-layer graph attention network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111164755.1A CN113887213A (en) 2021-09-30 2021-09-30 Event detection method and device based on multilayer graph attention network

Publications (1)

Publication Number Publication Date
CN113887213A true CN113887213A (en) 2022-01-04

Family

ID=79005069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111164755.1A Pending CN113887213A (en) 2021-09-30 2021-09-30 Event detection method and device based on multilayer graph attention network

Country Status (2)

Country Link
CN (1) CN113887213A (en)
WO (1) WO2023050470A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116303996B (en) * 2023-05-25 2023-08-04 江西财经大学 Theme event extraction method based on multifocal graph neural network
CN116629237B (en) * 2023-07-25 2023-10-10 江西财经大学 Event representation learning method and system based on gradually integrated multilayer attention
CN116701576B (en) * 2023-08-04 2023-10-10 华东交通大学 Event detection method and system without trigger words

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11132513B2 (en) * 2019-05-07 2021-09-28 International Business Machines Corporation Attention-based natural language processing
CN111259142B (en) * 2020-01-14 2020-12-25 华南师范大学 Specific target emotion classification method based on attention coding and graph convolution network
CN112163416B (en) * 2020-10-09 2021-11-02 北京理工大学 Event joint extraction method for merging syntactic and entity relation graph convolution network
CN112347248A (en) * 2020-10-30 2021-02-09 山东师范大学 Aspect-level text emotion classification method and system

Also Published As

Publication number Publication date
WO2023050470A1 (en) 2023-04-06

Similar Documents

Publication Publication Date Title
Dahouda et al. A deep-learned embedding technique for categorical features encoding
CN107066446B (en) Logic rule embedded cyclic neural network text emotion analysis method
Ma et al. Label embedding for zero-shot fine-grained named entity typing
US20200364253A1 (en) Method and system for analyzing entities
Al-Azani et al. Hybrid deep learning for sentiment polarity determination of Arabic microblogs
Aggarwal et al. Classification of fake news by fine-tuning deep bidirectional transformers based language model
CN113887213A (en) Event detection method and device based on multilayer graph attention network
CN109766557B (en) Emotion analysis method and device, storage medium and terminal equipment
US20200160196A1 (en) Methods and systems for detecting check worthy claims for fact checking
Goel et al. Sarcasm detection using deep learning and ensemble learning
CN112100401B (en) Knowledge graph construction method, device, equipment and storage medium for science and technology services
CN116304748B (en) Text similarity calculation method, system, equipment and medium
CN110705255A (en) Method and device for detecting association relation between sentences
Alexandridis et al. A knowledge-based deep learning architecture for aspect-based sentiment analysis
CN115017916A (en) Aspect level emotion analysis method and device, electronic equipment and storage medium
Truică et al. MCWDST: a minimum-cost weighted directed spanning tree algorithm for real-time fake news mitigation in social media
O'Keefe et al. Deep learning and word embeddings for tweet classification for crisis response
CN116521899B (en) Improved graph neural network-based document level relation extraction method and system
Wakchaure et al. A scheme of answer selection in community question answering using machine learning techniques
CN116257632A (en) Unknown target position detection method and device based on graph comparison learning
KR102567896B1 (en) Apparatus and method for religious sentiment analysis using deep learning
CN115827865A (en) Method and system for classifying objectionable texts by fusing multi-feature map attention mechanism
CN115129863A (en) Intention recognition method, device, equipment, storage medium and computer program product
WO2023077562A1 (en) Graph perturbation strategy-based event detection method and apparatus
Nguyen et al. A model of convolutional neural network combined with external knowledge to measure the question similarity for community question answering systems

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination