CN111160027A - Recurrent neural network event temporal relation identification method based on semantic attention

Recurrent neural network event temporal relation identification method based on semantic attention

Info

Publication number
CN111160027A
Authority
CN
China
Prior art keywords
vector
event
trigger word
trigger
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911335582.8A
Other languages
Chinese (zh)
Inventor
徐小良
高通
王宇翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN201911335582.8A
Publication of CN111160027A
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/355Class or cluster creation or modification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a recurrent neural network event temporal relation identification method based on semantic attention, which mainly comprises the following steps: first, syntactic dependency analysis is performed on the input event sentence and the semantic dependency branch related to the trigger word is extracted; a recurrent neural network is then used to obtain the corresponding hidden state vectors. Next, attention weight vectors are calculated for all tokens other than the trigger word, the tokens are fused according to their different weights, and the result is spliced with the trigger word vector to obtain the event sentence state vector. Finally, the event sentence state vector is put into a softmax function to predict the temporal relation. The method can effectively capture the semantic information hidden in the event sentence and can effectively associate and fuse different tokens, thereby improving the accuracy of event temporal relation identification.

Description

Recurrent neural network event temporal relation identification method based on semantic attention
Technical Field
The invention relates to the field of natural language processing, in particular to a recurrent neural network event temporal relation identification method based on semantic attention.
Background
Events, as an important form of knowledge representation, have received much attention in the field of natural language processing. An event is a set of related descriptions about a subject, objectively characterizing what happens to a particular subject (one or more persons and objects) in a specific time and place, and it is an important way of conveying information. The event temporal relation refers to the chronological order in which events occur; it is a semantic relation among events that links the evolution of a subject's events from beginning to end and the interrelations of those events. An example of event temporal relation identification (taken from the TimeBank-Dense corpus) is listed below.
Event sentence 1: conseco Inc. sand it is sealing for the reconstruction on Dec 7 of the 800000remaining shares.
Event sentence 2: the actual center all conversion rights on The stockwell terminate on Nov 30.
In the above example, there are two events, "calling" and "terminate", between which a temporal relation holds: the "calling" event occurs before the "terminate" event. The goal of event temporal relation identification is to accurately identify the temporal relations of related events in a given corpus.
In earlier research, the most common method for determining event temporal relations was pattern matching: event relation pairs in the text are matched against manually defined templates, and the relation between events is identified through the relation between their trigger words. The trigger word is the predicate that identifies an event, commonly a verb or a noun. However, manually defined event relation templates are limited by the format or content of the data and tend to suffer from low recall. In addition, template construction usually has domain limitations: different domains require different templates, and no universal event relation template applies to all types of events. With the establishment of corpora and knowledge bases, many research works began to introduce machine learning methods into event temporal relation research. The basic idea is to analyze the syntactic and lexical features of the relevant sentences to obtain the dependency relations and entity labels of each token, and finally put them into classifiers such as SVMs (support vector machines) for classification; however, the resulting accuracy is low, only slightly above 40%. With the rapid development of deep learning, some researchers have applied models such as CNNs and RNNs to event temporal relation identification, further improving the results. Subsequently, some researchers applied semantic dependencies to construct the input vector representation by intercepting the shortest dependency path associated with the trigger word. However, that method intercepts only the unidirectional branch related to the trigger word and ignores some of the trigger word's neighbors, which may cause important semantic information to be missed.
Analysis shows that these methods find it difficult to capture the semantic information implicit in the event sentence, and that different tokens lack effective association and information fusion, so the identification accuracy is not ideal.
Disclosure of Invention
The invention provides a recurrent neural network event temporal relation identification method based on a semantic attention mechanism, aiming to solve the problems that existing event temporal relation identification methods struggle to capture the semantic information implicit in an event sentence and lack effective association and information fusion among different tokens.
The technical scheme of the invention is as follows:
Step 1: construct the trigger word semantic dependency branch. The trigger word is the predicate used to identify an event, and it is usually a verb or a noun. First, syntactic dependency analysis is performed on the input event sentence to obtain a complete dependency syntax tree; the position of the trigger word is located, and its parent and sibling nodes are searched upward until the root node is reached. If the trigger word is not a leaf node, its child nodes are recursively searched downward from the trigger word's position. The two parts of information are combined to form the trigger word semantic dependency branch. Each token in the branch has three corresponding vectors, namely a word vector x_v, a part-of-speech vector x_p and a dependency-branch vector x_t. The three vectors are spliced to form the input vector x of the token, namely:

x = [x_v; x_p; x_t]    (1)
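By way of illustration, a minimal sketch of this branch construction is given below, using spaCy as the dependency parser; the parser choice, the function name trigger_dependency_branch and the default two-level recursion depth (taken from the analysis in the detailed description below) are illustrative assumptions rather than part of the patent.

```python
# Minimal sketch of Step 1 (assumed tooling: spaCy; the patent names no parser).
import spacy

nlp = spacy.load("en_core_web_sm")

def trigger_dependency_branch(sentence, trigger, max_depth=2):
    """Collect the trigger word's semantic dependency branch: parent and
    sibling nodes searched upward to the root, plus child nodes searched
    recursively downward from the trigger (depth-limited)."""
    doc = nlp(sentence)
    trig = next(tok for tok in doc if tok.text == trigger)

    branch = {trig}
    # Upward search: parent and sibling nodes until the root is reached.
    node = trig
    while node.head is not node:           # spaCy marks the root as its own head
        parent = node.head
        branch.add(parent)
        branch.update(parent.children)     # the node's siblings (parent's children)
        node = parent
    # Downward search: recurse into the trigger's children.
    def collect(tok, depth):
        if depth == 0:
            return
        for child in tok.children:
            branch.add(child)
            collect(child, depth - 1)
    collect(trig, max_depth)

    return sorted(branch, key=lambda tok: tok.i)   # restore sentence order
```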
step 2: a hidden state vector is obtained. Respectively training from the head and the tail of the semantic dependent branch of the trigger word by utilizing a cyclic neural network to obtain the forward propagation information h of the event sentenceleftAnd back propagation information hrightAnd then splicing the two vectors to obtain a hidden state vector h corresponding to the input vector x, namely:
h=[hleft;hright](2)
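A sketch of this step using a bidirectional LSTM in PyTorch follows; the Bi-LSTM choice matches the models discussed in the experiments below, while the dimensions are illustrative assumptions.

```python
# Sketch of Step 2: a bidirectional LSTM over the branch's input vectors.
import torch
import torch.nn as nn

input_dim, hidden_dim, m = 150, 128, 6   # x = [x_v; x_p; x_t]; m tokens (assumed sizes)
rnn = nn.LSTM(input_dim, hidden_dim, bidirectional=True, batch_first=True)

x = torch.randn(1, m, input_dim)         # (batch, tokens, features)
h, _ = rnn(x)                            # h: (1, m, 2 * hidden_dim)
# For token i, h[:, i, :hidden_dim] is the forward state h_left and
# h[:, i, hidden_dim:] is the backward state h_right, so the LSTM output
# is already the concatenation h = [h_left; h_right] of equation (2).
```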
step3, calculating attention weight vectors except for trigger words, wherein different participles in an event sentence have different influence degrees on event time sequence, and adjusting the influence degrees among different words by introducing the attention weight vector β1,h2,h3,…,hmAnd m is the number of word segmentation. Then, calculation of the attention weight vector excluding the trigger word is started.
ui=tanh(Wuhi+bu) ⑶
Figure BDA0002330830460000031
Wherein, WuIs a weight vector, buAs an offset value, βiFor a certain participle h in an event sentenceiT represents the position index of the trigger word in the event sentence.
Each token is fused to a different degree according to the calculated attention weights, and the result is spliced with the trigger word vector to obtain the event sentence state vector e*, namely:

e* = tanh(W_1 Σ_{i≠t} β_i h_i + W_2 h_t)    (5)

where h_t denotes the trigger word hidden state vector, and W_1 and W_2 are shared learned weight vectors.
Step 4: and (6) classifying the result. The experimental corpus is trained in the form of event sentence pairs, namely, two event sentences exist in a line of corpus, and after each event sentence is trained through the steps, state vectors are respectively obtained
Figure BDA0002330830460000033
And
Figure BDA0002330830460000034
the two vectors are spliced and then put into a softmax function for classification, and the most possible time sequence relation is predicted, namely:
Figure BDA0002330830460000035
Figure BDA0002330830460000036
wherein, WleftAnd WrightRepresenting a weight vector, W, with respect to the state vectorclassRepresenting weight vectors with respect to classification, bclassRepresenting the bias value for the classification.
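A sketch of Step 4 under the same assumptions follows; the six output classes correspond to the six TimeBank-Dense relations mentioned in the detailed description.

```python
# Sketch of Step 4: splice the two event-sentence state vectors and classify.
import torch
import torch.nn as nn

hidden, n_classes = 256, 6                      # six TimeBank-Dense relations
W_left = nn.Linear(hidden, hidden, bias=False)
W_right = nn.Linear(hidden, hidden, bias=False)
W_class = nn.Linear(2 * hidden, n_classes)      # W_class and b_class

def predict(e1_star, e2_star):
    pair = torch.cat([W_left(e1_star), W_right(e2_star)], dim=-1)   # equation (6)
    probs = torch.softmax(W_class(pair), dim=-1)                    # equation (7)
    return probs.argmax(dim=-1)                 # index of the most probable relation
```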
The invention has the beneficial effects that:
(1) The invention provides a recurrent neural network event temporal relation identification method based on a semantic attention mechanism. The method can effectively capture the semantic information hidden in the event sentence, and effective fusion and association can be established among different tokens.
(2) The method calculates attention weight vectors excluding the trigger word. Different tokens in the event sentence influence the event's temporal order to different degrees, and the attention weight vector fuses the different vectors to different degrees. Because the trigger word vector is the most important vector in the event sentence, the trigger word is not put into the attention weight calculation; instead, it is spliced in separately during the subsequent vector fusion, so that the trigger word information is fully preserved.
Drawings
FIG. 1 is example 1 of the trigger word semantic dependency branch referred to in the recurrent neural network event temporal relation identification based on the semantic attention mechanism proposed by the present invention.
FIG. 2 is example 2 of the trigger word semantic dependency branch referred to in the recurrent neural network event temporal relation identification based on the semantic attention mechanism proposed by the present invention.
FIG. 3 is a flow chart of the recurrent neural network event temporal relation identification based on the semantic attention mechanism proposed by the present invention.
FIG. 4 is a model diagram of the recurrent neural network event temporal relation identification based on the semantic attention mechanism proposed by the present invention.
Detailed Description
For a better understanding of the present invention, the invention is further explained below with reference to the accompanying drawings and specific examples:
the invention comprises the following steps:
step 1: and constructing a trigger word sense dependent branch. Trigger is a predicate used to identify an event, and there are many verbs and nouns in general. Firstly, carrying out syntactic dependency relationship analysis on an input event sentence to obtain a complete dependency syntactic tree, searching the position of a trigger word, and searching a father node and a brother node of the trigger word until a root node is finished; if the trigger is not a leaf node, its child nodes are recursively searched downward from the trigger position. Through experimental result analysis, the effect is best when the two times of downward recursion searching are carried out, the method can effectively capture the semantic information hidden in the event sentence, and effective fusion and connection can be established among different participles. And combining the two parts of information to form a trigger word meaning dependence branch. Each participle in the trigger word sense dependent branch has three corresponding vectors, namely a word vector xvPart of speech vector xpAnd dependent branch vector xt. The three vectors are spliced to form an input vector x of the word segmentation, namely:
Figure BDA0002330830460000051
for example, for a Timebank-Dense corpus:
event sentence 1: conseco Inc. sand it is sealing for the reconstruction on Dec 7 of the 800000remaining shares.
Event sentence 2: the actual center all conversion rights on The stockwell terminate on Nov 30.
And analyzing the syntactic dependency relationship of the event sentence to obtain a complete dependency syntactic tree, wherein the specific tree structure is shown in fig. 1 and fig. 2. Then finding the specific positions of the triggering words "capturing" and "terminate", and starting from the current position of the triggering word, searching the participles related to the triggering word according to the above rules. For the first event sentence, the participles related to the triggering word "catching" are "said", "it", "is", "for", "redaction" by searching; for the second event sentence, the participle related to the trigger word "terminate" is "said", "rights", "will", "all", "conversion", "on", "Nov" by search. By triggering the word dependence branches, the parts of speech and the dependence relationship of the words can be acquired. Converting them into corresponding vector information, and obtaining the vector representation e of the event sentence 11={xsaid,xit,xis,xcalling,xfor,xredemptionVector representation e of f and event sentence 22={xsaid,xrights,xall,xwill,xterminate,xconversion,xon,xNov}。
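Continuing the Step 1 sketch given earlier, a usage example on event sentence 1 might look as follows; the exact token set depends on the parser model, so the printed output is only the expected result.

```python
# Applying the Step 1 sketch to event sentence 1 with trigger word "calling".
tokens = trigger_dependency_branch(
    "Conseco Inc. said it is calling for the redemption on Dec 7 "
    "of the 800,000 remaining shares.",
    "calling",
)
print([tok.text for tok in tokens])
# Expected to include: said, it, is, calling, for, redemption
```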
Step 2: a hidden state vector is obtained. Respectively training from the head and the tail of the semantic dependent branch of the trigger word by utilizing a cyclic neural network to obtain the forward propagation information h of the event sentenceleftAnd back propagation information hrightAnd then splicing the two vectors to obtain a hidden state vector h corresponding to the input vector x, namely:
h=[hleft;hright]⑼
e.g. event sentence e described above1And e2Training can obtain corresponding hidden state vector
Figure BDA0002330830460000052
And
Figure BDA0002330830460000053
Figure BDA0002330830460000054
step3, calculating attention weight vectors except for trigger words, wherein different participles in an event sentence have different influence degrees on event time sequences, and the influence degrees among different words are adjusted by introducing the attention weight vector β1,h2,h3,…,hmAnd m is the number of word segmentation. Then, calculation of the attention weight vector excluding the trigger word is started.
ui=tanh(Wuhi+bu) (10)
Figure BDA0002330830460000061
Wherein, WuIs a weight vector, buAs an offset value, βiFor a certain participle h in an event sentenceiT represents the position index of the trigger word in the event sentence.
Fusing each participle to different degrees according to the attention weight vector obtained by calculation, and splicing the participle with the trigger word vector to obtain an event sentence state vector e*Namely:
Figure BDA0002330830460000062
wherein h istRepresenting a trigger word hidden state vector.
For example, for the event sentence h_e1 above, invoking the above formulas gives u_said = tanh(W_u h_said + b_u), u_it = tanh(W_u h_it + b_u), u_for = tanh(W_u h_for + b_u) and u_redemption = tanh(W_u h_redemption + b_u), and each token is then given its corresponding attention weight β_said, β_it, β_for and β_redemption. The state vector e*_1 of the event sentence h_e1 is then calculated. Similarly, the state vector e*_2 of the event sentence h_e2 can be obtained.
Step 4: and (6) classifying the result. The experimental corpus is trained in the form of event sentence pairs, namely, two event sentences exist in a line of corpus, and after each event sentence is trained through the steps, state vectors are respectively obtained
Figure BDA0002330830460000066
And
Figure BDA0002330830460000067
the two vectors are spliced and then put into a softmax function for classification, and the most possible time sequence relation is predicted, namely:
Figure BDA0002330830460000068
Figure BDA0002330830460000069
wherein, WleftAnd WrightRepresenting a weight vector, W, with respect to the state vectorclassRepresenting weight vectors with respect to classification, bclassRepresenting the bias value for the classification.
For example, splicing the event sentence state vectors e*_1 and e*_2 above and putting them into the softmax function yields an array of length 6; six relations are defined in the TimeBank-Dense corpus, hence the array of length 6. According to the result, the temporal relation "BEFORE" has the highest probability, so the predicted temporal relation of the events "calling" and "terminate" is "BEFORE".
The experiments use precision P, recall R and F1 value as evaluation criteria. Five different experimental tasks were set up, and CNN, LSTM, Bi-LSTM, the DP-based LSTMs model proposed by Cheng Fei, and the method provided by the invention were each trained and compared. The actual results are shown in the table:
TABLE 1 Comparative experimental results

(The table is presented as an image in the original publication.)
As can be seen from the training data above, Bi-LSTM performs better than the traditional CNN model: in capturing the word features of a sentence, CNN can only extract position-invariant features and lacks consideration of global context information, whereas the hidden states of Bi-LSTM can fully memorize and learn the information of the whole context, thus achieving better performance. The DP-based LSTMs model differs from Bi-LSTM in its input vector: it intercepts only the shortest dependency path, ignoring some of the trigger word's neighbor nodes and thereby missing important semantic information, and it trains with only a single-layer LSTM, so its actual effect deviates slightly. Introducing an attention mechanism on top of the Bi-LSTM model further improves the experimental results: for the Bi-LSTM model, the attention mechanism captures the degree to which different tokens influence the context, so the contextual relation between the tokens and the event trigger word can be fully mined and the temporal relation of the candidate event pair correctly predicted.
The input vectors of this experiment comprise the word vector x_v, the part-of-speech vector x_p and the dependency-branch vector x_t. The three vectors are combined in different ways to observe the influence of different vector combinations on event temporal relation identification.
TABLE 2 Influence of different input vector combinations on the results

(The table is presented as an image in the original publication.)
As the results of Table 2 show, when the word vector, the part-of-speech vector and the dependency-branch vector are all provided as input, the contextual semantic information is sufficiently represented and event temporal relation identification performs best.
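The precision, recall and F1 values used as evaluation criteria can be computed, for example, with scikit-learn; the tooling choice and the illustrative labels below are assumptions, not part of the patent.

```python
# Sketch of the evaluation criteria P, R and F1 used in the experiments.
from sklearn.metrics import precision_recall_fscore_support

y_true = ["BEFORE", "AFTER", "BEFORE", "SIMULTANEOUS"]   # illustrative gold labels
y_pred = ["BEFORE", "BEFORE", "BEFORE", "SIMULTANEOUS"]  # illustrative predictions

p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="micro")
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
```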
The embodiments of the present invention are explained in detail above with reference to the drawings, but the present invention is not limited to these embodiments; modifications and substitutions made by others skilled in the art on the basis of the present invention fall within the protection scope of the invention.

Claims (1)

1. A recurrent neural network event temporal relation identification method based on semantic attention, comprising the following steps:
step 1: constructing the trigger word semantic dependency branch;
performing syntactic dependency analysis on the event sentence to obtain a complete dependency syntax tree, locating the position of the trigger word, obtaining the parent and sibling nodes of the trigger word, and recursively searching upward for parent nodes until the root node is reached; if the trigger word is not a leaf node, recursively searching downward from the trigger word's position for its child nodes;
combining the two parts of information obtained by the upward and downward recursive searches to form the trigger word semantic dependency branch, wherein each token in the branch has three corresponding vectors, namely a word vector x_v, a part-of-speech vector x_p and a dependency-branch vector x_t; the three vectors are spliced to form the input vector x of the token, namely:

x = [x_v; x_p; x_t]    (1)

step 2: obtaining the hidden state vectors;
training a recurrent neural network from the head and from the tail of the trigger word semantic dependency branch respectively to obtain the forward propagation information vector h_left and the backward propagation information vector h_right of the event sentence, and then splicing the two vectors to obtain the hidden state vector h corresponding to the input vector x, namely:

h = [h_left; h_right]    (2)

step 3: calculating the attention weight vector excluding the trigger word;
letting the hidden state vectors of the trigger word semantic dependency branch be h = {h_1, h_2, h_3, …, h_m}, where m is the number of tokens, and calculating the attention weight vector excluding the trigger word:

u_i = tanh(W_u h_i + b_u)    (3)

β_i = exp(u_i) / Σ_{j≠t} exp(u_j)    (4)

wherein W_u is a weight vector, b_u is a bias value, β_i is the attention weight of token h_i in the event sentence, and t denotes the position index of the trigger word in the event sentence;
fusing each token to a different degree according to the calculated attention weights and splicing the result with the trigger word vector to obtain the event sentence state vector e*, namely:

e* = tanh(W_1 Σ_{i≠t} β_i h_i + W_2 h_t)    (5)

wherein h_t denotes the trigger word hidden state vector, and W_1 and W_2 are shared learned weight vectors;
step 4: classifying the result;
after each event sentence is processed by the above steps, the state vectors e*_1 and e*_2 are obtained respectively; the two vectors are spliced and then put into a softmax function for classification, and the most probable temporal relation is predicted, namely:

e = [e*_1 ; e*_2]    (6)

y = softmax(W_class e + b_class)    (7)

wherein W_class denotes the weight vector for classification, and b_class denotes the classification bias value.
CN201911335582.8A 2019-12-23 2019-12-23 Recurrent neural network event temporal relation identification method based on semantic attention Pending CN111160027A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911335582.8A CN111160027A (en) 2019-12-23 2019-12-23 Recurrent neural network event temporal relation identification method based on semantic attention

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911335582.8A CN111160027A (en) 2019-12-23 2019-12-23 Recurrent neural network event temporal relation identification method based on semantic attention

Publications (1)

Publication Number Publication Date
CN111160027A true CN111160027A (en) 2020-05-15

Family

ID=70557807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911335582.8A Pending CN111160027A (en) 2019-12-23 2019-12-23 Recurrent neural network event temporal relation identification method based on semantic attention

Country Status (1)

Country Link
CN (1) CN111160027A (en)


Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110334213A (en) * 2019-07-09 2019-10-15 昆明理工大学 The Chinese based on bidirectional crossed attention mechanism gets over media event sequential relationship recognition methods

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11573992B2 (en) 2020-06-30 2023-02-07 Beijing Baidu Netcom Science And Technology Co., Ltd. Method, electronic device, and storage medium for generating relationship of events
CN111859911B (en) * 2020-07-28 2023-07-25 中国平安人寿保险股份有限公司 Image description text generation method, device, computer equipment and storage medium
CN112507077A (en) * 2020-12-15 2021-03-16 杭州电子科技大学 Event time sequence relation identification method based on relational graph attention neural network
CN112507077B (en) * 2020-12-15 2022-05-20 杭州电子科技大学 Event time sequence relation identification method based on relational graph attention neural network
CN113761337A (en) * 2020-12-31 2021-12-07 国家计算机网络与信息安全管理中心 Event prediction method and device based on implicit elements and explicit relations of events
CN113761337B (en) * 2020-12-31 2023-10-27 国家计算机网络与信息安全管理中心 Event prediction method and device based on implicit event element and explicit connection

Similar Documents

Publication Publication Date Title
CN110895932B (en) Multi-language voice recognition method based on language type and voice content collaborative classification
CN110134757B (en) Event argument role extraction method based on multi-head attention mechanism
CN109543183B (en) Multi-label entity-relation combined extraction method based on deep neural network and labeling strategy
CN111241294B (en) Relationship extraction method of graph convolution network based on dependency analysis and keywords
Alayrac et al. Unsupervised learning from narrated instruction videos
CN111160027A (en) Recurrent neural network event temporal relation identification method based on semantic attention
CN111931506B (en) Entity relationship extraction method based on graph information enhancement
CN109635108B (en) Man-machine interaction based remote supervision entity relationship extraction method
CN111353306B (en) Entity relationship and dependency Tree-LSTM-based combined event extraction method
CN109710744B (en) Data matching method, device, equipment and storage medium
Fleischman et al. Maximum entropy models for FrameNet classification
CN109857846B (en) Method and device for matching user question and knowledge point
CN107608960B (en) Method and device for linking named entities
CN112163425A (en) Text entity relation extraction method based on multi-feature information enhancement
CN113761893B (en) Relation extraction method based on mode pre-training
CN113505209A (en) Intelligent question-answering system for automobile field
US20200089756A1 (en) Preserving and processing ambiguity in natural language
US20220414463A1 (en) Automated troubleshooter
CN112507077B (en) Event time sequence relation identification method based on relational graph attention neural network
CN113821605A (en) Event extraction method
CN114217766A (en) Semi-automatic demand extraction method based on pre-training language fine-tuning and dependency characteristics
CN113705237A (en) Relation extraction method and device fusing relation phrase knowledge and electronic equipment
CN113590810A (en) Abstract generation model training method, abstract generation device and electronic equipment
CN113535897A (en) Fine-grained emotion analysis method based on syntactic relation and opinion word distribution
CN115017268A (en) Heuristic log extraction method and system based on tree structure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination