WO2023050470A1 - Event detection method and apparatus based on multi-layer graph attention network - Google Patents


Info

Publication number
WO2023050470A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
vector
context
word
syntactic
Prior art date
Application number
PCT/CN2021/123249
Other languages
French (fr)
Chinese (zh)
Inventor
包先雨
吴共庆
何俐娟
柯培超
陆振亚
王歆
程立勋
蔡伊娜
郑文丽
慕容灏鼎
蔡屹
Original Assignee
深圳市检验检疫科学研究院
合肥工业大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市检验检疫科学研究院 and 合肥工业大学
Publication of WO2023050470A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/284 - Lexical analysis, e.g. tokenisation or collocates
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/205 - Parsing
    • G06F40/211 - Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/044 - Recurrent networks, e.g. Hopfield networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The present application provides an event detection method and apparatus based on a multi-layer graph attention network. The method comprises: acquiring context words in event text information, and determining a syntactic-information adjacency matrix and a concatenated vector corresponding to the context words; using the adjacency matrix and the concatenated vector as the input of an artificial neural network to obtain an output vector; aggregating the concatenated vector and the output vector to generate aggregation information; and determining the trigger-word category of the context words according to the aggregation information. By combining the syntactic information and the context information of the context words, the present application effectively mitigates the information loss and error propagation that syntactic analysis tools are prone to; and by incorporating a skip-connection module into the graph attention network layers, it avoids poor final trigger-word classification caused by over-propagation of short-distance syntactic information, effectively improving the precision, recall, and F1 score of trigger-word classification.

Description

Event Detection Method and Apparatus Based on a Multi-Layer Graph Attention Network

Technical Field

The present application relates to the field of natural language processing, and in particular to an event detection method and apparatus based on a multi-layer graph attention network.

Background
A knowledge graph (Knowledge Graph) describes the concepts, entities, and relationships of the objective world in a structured form, expressing Internet information in a form closer to human cognition and providing a better way to organize, manage, and understand the Internet's massive information. The knowledge graph was proposed by Google in 2012 and successfully applied to its search engine. Knowledge graphs belong to knowledge engineering, an important research area of artificial intelligence, and are a killer application of knowledge engineering for building large-scale knowledge resources. Typical examples are the Knowledge Graph launched by Google in 2012 after acquiring Freebase (a free knowledge database), Facebook's Graph Search, Microsoft Satori, and domain-specific knowledge bases in business, finance, life sciences, and other fields.

The event knowledge in a knowledge graph is implicit in Internet resources, including existing structured semantic knowledge, structured information in databases, semi-structured information resources, and unstructured resources; resources of different natures call for different knowledge acquisition methods. Event identification and extraction studies how to identify and extract event information from text describing events and present it in a structured form, including the time and place of occurrence, the participating roles, and the associated actions or state changes.

Traditional event detection methods ignore the syntactic features between words in a sentence and use only sentence-level features, so event detection easily suffers from word ambiguity, leading to low trigger-word recognition efficiency and classification accuracy. In recent years, using syntactic information to improve event detection has proven very effective. For example, the paper "Trigger-Word-Free Event Detection Method Fusing Syntactic Information" proposes using syntactic information combined with an attention mechanism (ATTENTION) to connect scattered event information within a sentence and improve event detection accuracy; the paper "Vietnamese News Event Detection Fusing Dependency Information and Convolutional Neural Networks" uses convolutions fused with dependency syntactic information to encode features between non-contiguous words, then fuses the two kinds of features as the event encoding to perform event detection.
Summary of the Invention

In view of the above problems, the present application is proposed to provide an event detection method based on a multi-layer graph attention network that overcomes the problems or at least partially solves them.

An event detection method based on a multi-layer graph attention network, comprising the steps of:

acquiring context words in event text information, and determining a syntactic-information adjacency matrix and a concatenated vector corresponding to the context words;

using the adjacency matrix and the concatenated vector as the input of an artificial neural network to obtain an output vector;

aggregating the concatenated vector and the output vector to generate aggregation information; and

determining the trigger-word category of the context words according to the aggregation information.
Further, the step of acquiring context words in event text information and determining the syntactic-information adjacency matrix and concatenated vector corresponding to the context words comprises:

determining syntactic information corresponding to the context words according to the context words;

generating the syntactic-information adjacency matrix according to the syntactic information; and

generating the concatenated vector according to the word embedding vectors of the context words.

Further, the step of determining syntactic information corresponding to the context words according to the context words comprises:

analyzing the event text information through syntactic dependency parsing, and generating the syntactic information corresponding to the context words according to the analysis result of the event text information.

Further, the step of using the adjacency matrix and the concatenated vector as the input of the artificial neural network to obtain the output vector comprises:

forming a tensor from the adjacency matrices of the same batch; and

inputting the tensor and the concatenated vector into the artificial neural network for computation, and generating the output vector according to the computation result of the artificial neural network.

Further, the step of determining the trigger-word category of the context words according to the aggregation information comprises:

determining the trigger words of the context words according to the aggregation information, and classifying the trigger words with a classifier module.
An event detection apparatus based on a multi-layer graph attention network, comprising:

an acquisition module, configured to acquire context words in event text information, and determine a syntactic-information adjacency matrix and a concatenated vector corresponding to the context words;

a computation module, configured to use the adjacency matrix and the concatenated vector as the input of an artificial neural network to obtain an output vector;

an aggregation module, configured to aggregate the concatenated vector and the output vector to generate aggregation information; and

a classification module, configured to determine the trigger-word category of the context words according to the aggregation information.

Further, the acquisition module comprises:

an expression submodule, configured to determine syntactic information corresponding to the context words according to the context words;

a generation submodule, configured to generate the syntactic-information adjacency matrix according to the syntactic information; and

a concatenation submodule, configured to generate the concatenated vector according to the word embedding vectors of the context words.

Further, the expression submodule comprises:

a dependency analysis submodule, configured to analyze the event text information through syntactic dependency parsing, and generate the syntactic information corresponding to the context words according to the analysis result of the event text information.

Further, the computation module comprises:

an array conversion submodule, configured to form a tensor from the adjacency matrices of the same batch; and

an artificial neural network computation submodule, configured to input the tensor and the concatenated vector into the artificial neural network for computation, and generate the output vector according to the computation result of the artificial neural network.

Further, the classification module comprises:

a trigger-word processing submodule, configured to determine the trigger words of the context words according to the aggregation information, and classify the trigger words with a classifier module.
The present application has the following advantages:

In the embodiments of the present application, context words are acquired from event text information, and the syntactic-information adjacency matrix and concatenated vector corresponding to the context words are determined; the adjacency matrix and the concatenated vector are used as the input of an artificial neural network to obtain an output vector; aggregation information is generated by aggregating the concatenated vector and the output vector; and the trigger-word category of the context words is determined according to the aggregation information. By combining the syntactic information and the context information of the context words, the present application effectively mitigates the information loss and error propagation that syntactic analysis tools are prone to; and by incorporating a skip-connection module into the graph attention network layers, more of the original features are retained, avoiding poor final trigger-word classification caused by over-propagation of short-distance syntactic information, and effectively improving the precision, recall, and F1 score of trigger-word classification.
Brief Description of the Drawings

In order to explain the technical solution of the present application more clearly, the drawings needed in the description of the present application are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flow chart of the steps of an event detection method based on a multi-layer graph attention network provided by an embodiment of the present application;

FIG. 2 is a schematic diagram of a syntactic dependency tree provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of an adjacency matrix provided by an embodiment of the present application;

FIG. 4 is a schematic diagram of a graph attention network provided by an embodiment of the present application;

FIG. 5 is a schematic flow diagram of an event detection method based on a multi-layer graph attention network provided by an embodiment of the present application;

FIG. 6 is a structural block diagram of an event detection apparatus based on a multi-layer graph attention network provided by an embodiment of the present application.
Detailed Description

To make the purpose, features, and advantages of the present application more clearly understood, the present application is further described in detail below in conjunction with the drawings and specific embodiments. Obviously, the described embodiments are some, but not all, of the embodiments of the present application. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of this application.
Referring to FIG. 1, an event detection method based on a multi-layer graph attention network provided by an embodiment of the present application is shown.

The method comprises:

S110. Acquire context words in the event text information, and determine a syntactic-information adjacency matrix and concatenated vector corresponding to the context words.

S120. Use the adjacency matrix and the concatenated vector as the input of the artificial neural network to obtain an output vector.

S130. Aggregate the concatenated vector and the output vector to generate aggregation information.

S140. Determine the trigger-word category of the context words according to the aggregation information.

In the embodiments of the present application, context words are acquired from the event text information, and the syntactic-information adjacency matrix and concatenated vector corresponding to the context words are determined; the adjacency matrix and the concatenated vector are used as the input of the artificial neural network to obtain an output vector; aggregation information is generated by aggregating the concatenated vector and the output vector; and the trigger-word category of the context words is determined according to the aggregation information. By combining the syntactic information and the context information of the context words, the present application effectively mitigates the information loss and error propagation that syntactic analysis tools are prone to; and by incorporating a skip-connection module into the graph attention network layers, more of the original features are retained, avoiding poor final trigger-word classification caused by over-propagation of short-distance syntactic information, and effectively improving the precision, recall, and F1 score of trigger-word classification.
In the following, the event detection method based on a multi-layer graph attention network in this exemplary embodiment is further described.

As described in step S110, context words in the event text information are acquired, and the syntactic-information adjacency matrix and concatenated vector corresponding to the context words are determined.

In an embodiment of the present application, the specific process of "acquiring context words in the event text information, and determining the syntactic-information adjacency matrix and concatenated vector corresponding to the context words" in step S110 can be further explained in conjunction with the following description.

As described in the following steps, syntactic information corresponding to the context words is determined according to the context words.

In an embodiment of the present application, the specific process of "determining syntactic information corresponding to the context words according to the context words" can be further explained in conjunction with the following description.

As described in the following steps, the event text information is analyzed through syntactic dependency parsing, and the syntactic information corresponding to the context words is generated according to the analysis result of the event text information.

It should be noted that syntactic dependency parsing reveals the syntactic structure of a sentence by analyzing the dependency relationships between the components of a linguistic unit. It identifies grammatical components such as subject, predicate, object, attributive, adverbial, and complement, and emphasizes the relationships between words. In syntactic dependency parsing, the core of a sentence is the predicate verb; the other components are then identified around the predicate, and the sentence is finally analyzed into a syntactic dependency tree, which describes the dependency relationships among the words.

In a specific implementation, the event text information is acquired and recognized, and syntactic dependency parsing is performed with Stanford CoreNLP (the Stanford natural language processing toolkit). Each sentence in the event text is analyzed to identify its event trigger word, with emphasis on the dependency relationships between the event trigger word and the event arguments, and/or between event arguments, forming a syntactic dependency tree.

Here, an event trigger word is the word in an event that best indicates the event's occurrence; it is the projection of the event concept at the word and phrase level, the basis for event recognition, and an important feature for determining the event category, and is generally a verb or a noun. Event arguments are the information describing the time, place, participants, and so on of an event.

Referring to FIG. 2, a schematic diagram of a syntactic dependency tree provided by an embodiment of the present application is shown. For the sentence "我去北京天安门看太阳升起" ("I go to Beijing Tiananmen to watch the sun rise"), the constructed syntactic dependency tree shows that the core predicate of the sentence is "去" ("go"), which is the root of the tree; the subject of "去" is "我" ("I"), the object of "去" is "北京天安门" ("Beijing Tiananmen"), and the object of the other verb "看" ("watch") is "太阳" ("sun"). The syntactic dependency tree describes the dependency relationships between the context words.
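A dependency parse like the one above can be represented as a set of (head, relation, dependent) triples. The sketch below hard-codes a parse of the example sentence rather than calling a parser (running Stanford CoreNLP requires a separate server process), so the triples and relation labels are illustrative assumptions rather than actual parser output:

```python
# Hand-written dependency parse of the example sentence, for illustration;
# a real system would obtain these triples from a parser such as
# Stanford CoreNLP, and the relation labels here are assumptions.
ROOT = 0  # index 0 denotes the virtual root node
words = ["我", "去", "北京", "天安门", "看", "太阳", "升起"]

# (head index, relation, dependent index); word indices are 1-based.
triples = [
    (ROOT, "root", 2),  # 去 is the core predicate
    (2, "nsubj", 1),    # 我 is the subject of 去
    (2, "dobj", 3),     # 北京 is an object of 去
    (2, "dobj", 4),     # 天安门 is a coordinate object of 去
    (2, "conj", 5),     # 看 coordinates with 去
    (5, "dobj", 6),     # 太阳 is the object of 看
    (6, "dep", 7),      # 升起 attaches to 太阳
]

def root_word(triples, words):
    """Return the word attached to the virtual root (the core predicate)."""
    for head, rel, dep in triples:
        if head == ROOT and rel == "root":
            return words[dep - 1]
    return None

print(root_word(triples, words))  # prints 去
```

The triple list is exactly the information the next steps consume: every (head, dependent) pair becomes one directed arc of the dependency tree.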
As described in the following steps, the syntactic-information adjacency matrix is generated according to the syntactic information.

It should be noted that an adjacency matrix represents the adjacency relationships between vertices. Let G=(V,E) be a graph, where V={v1,v2,…,vn} is the vertex set and E the edge set; a one-dimensional array stores the data of all vertices in the graph, and a two-dimensional array stores the data of the relationships (edges or arcs) between vertices; this two-dimensional array is called the adjacency matrix. Adjacency matrices are divided into directed-graph and undirected-graph adjacency matrices. The adjacency matrix of G is an n-th order square matrix with the following properties: for an undirected graph the adjacency matrix must be symmetric with a zero main diagonal (the anti-diagonal is not necessarily zero), while for a directed graph this is not necessarily so. In an undirected graph, the degree of any vertex i is the number of non-zero elements in the i-th column (or i-th row); in a directed graph, the out-degree of vertex i is the number of non-zero elements in the i-th row, and the in-degree is the number of non-zero elements in the i-th column. The adjacency matrix of a directed graph is used to store the syntactic dependency relationship between two event arguments.

As an example, each sentence is analyzed into a syntactic dependency tree through syntactic dependency parsing, and the corresponding adjacency matrix is then generated from the tree.

In a specific implementation, referring to FIG. 3, a schematic diagram of an adjacency matrix provided by an embodiment of the present application is shown; the adjacency matrix in FIG. 3 corresponds to the syntactic dependency tree in FIG. 2. The trigger word in FIG. 2 is "去", and "北京" and "天安门" are coordinate objects, so in the corresponding adjacency matrix the entries at the intersections of the row of "去" with the columns of "北京" and "天安门" are both 1. Each word is a node; "我", "去", "北京", "天安门", "看", "太阳", and "升起" are seven words, so the matrix is a 7×7 square matrix. If a syntactic arc exists between two words, the corresponding matrix entry is 1, otherwise 0. The adjacency matrix of a directed graph stores the syntactic dependencies of the text: if a dependency exists between two words, the corresponding adjacency matrix element is 1; between words with no dependency, the corresponding element is 0. The adjacency matrix thus represents the dependency relationships between the context words.
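The mapping from dependency arcs to the 7×7 matrix described above can be sketched as follows; the arc list is an illustrative assumption matching FIG. 2, not real parser output:

```python
words = ["我", "去", "北京", "天安门", "看", "太阳", "升起"]
n = len(words)

# Directed dependency arcs (head -> dependent), 0-based indices,
# assumed to match the tree of FIG. 2 for illustration.
arcs = [(1, 0), (1, 2), (1, 3), (1, 4), (4, 5), (5, 6)]

# A[i][j] = 1 iff there is a syntactic arc from word i to word j.
A = [[0] * n for _ in range(n)]
for head, dep in arcs:
    A[head][dep] = 1

# In a directed adjacency matrix, row sums give out-degree and
# column sums give in-degree, as stated in the text.
out_degree = [sum(row) for row in A]
in_degree = [sum(A[i][j] for i in range(n)) for j in range(n)]

print(out_degree)  # [0, 4, 0, 0, 1, 1, 0]
print(in_degree)   # [1, 0, 1, 1, 1, 1, 1]
```

Note that the row for "去" (index 1) has 1s in the columns of both "北京" and "天安门", matching the coordinate-object case discussed above.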
As described in the following steps, the concatenated vector is generated according to the word embedding vectors of the context words.

It should be noted that the word-level information in a sentence needs to be converted into real-valued vectors as the input of the artificial neural network. Let X={x1,x2,x3,…,xn} be a sentence of length n, where xi is the i-th word. In natural language processing tasks, the semantic information of a word is related to its position in the sentence, and part-of-speech and entity-type information improves trigger-word recognition and semantic understanding. The present application concatenates the word-sense vector, entity vector, part-of-speech vector, and position vector of each context word into a concatenated vector as the input of the artificial neural network.

In a specific implementation, the four different word embedding vectors of each context word (the word-sense vector, entity vector, part-of-speech vector, and position vector) are concatenated into a first concatenated vector, which is then input into a Bi-LSTM neural network layer to generate a second concatenated vector. The second concatenated vector serves as one of the input vectors of the multi-layer graph attention network; the concatenated vector captures the semantic information among the context words.
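The concatenation of the four word-level embeddings can be sketched with NumPy; the embedding dimensions are illustrative assumptions (the text does not specify them), the embeddings are random stand-ins for trained lookup tables, and the Bi-LSTM that produces the second concatenated vector is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
n_words = 7  # length of the example sentence

# Illustrative embedding dimensions; not specified in the text.
DIM_WORD, DIM_ENTITY, DIM_POS, DIM_POSITION = 100, 20, 20, 10

word_emb = rng.normal(size=(n_words, DIM_WORD))          # word-sense vectors
entity_emb = rng.normal(size=(n_words, DIM_ENTITY))      # entity-type vectors
pos_emb = rng.normal(size=(n_words, DIM_POS))            # part-of-speech vectors
position_emb = rng.normal(size=(n_words, DIM_POSITION))  # position vectors

# First concatenated vector: one row per word, all four embeddings joined.
first_concat = np.concatenate(
    [word_emb, entity_emb, pos_emb, position_emb], axis=1
)
print(first_concat.shape)  # (7, 150)
```

In the described method, `first_concat` would then be fed through a Bi-LSTM layer, and the Bi-LSTM outputs form the second concatenated vector passed to the graph attention layers.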
As described in step S120, the adjacency matrix and the concatenated vector are used as the input of the artificial neural network to obtain an output vector.

It should be noted that the artificial neural network is a multi-layer graph attention network (Graph Attention Network). Traditional graph convolutional networks have various limitations: they cannot handle directed graphs well, cannot be applied to inductive tasks (tasks in which the graph structures processed in the training and testing phases differ), and cannot handle dynamic graphs. A graph attention network remedies these defects: for each node, an attention mechanism can be used to compute the similarity coefficient of node j with respect to node i, so the model does not need to rely entirely on the graph structure and can also be applied to inductive tasks. Under a graph attention network, even if the graph structure is changed during prediction, the impact on the network is small; it is only necessary to adjust the parameters and recompute. The graph attention network operates vertex by vertex, and each operation loops over all vertices of the graph. Vertex-by-vertex operation means breaking free from the constraints of the Laplacian matrix of the original graph structure, so that the directed-graph problem is readily solved.

In an embodiment of the present application, the specific process of "using the adjacency matrix and the concatenated vector as the input of the artificial neural network to obtain the output vector" in step S120 can be further explained in conjunction with the following description.

As described in the following steps, a tensor is formed from the adjacency matrices of the same batch.
在一具体实现中,事件文本信息中同一时刻识别的句子为一个批次,将同一个批次的句子的所述邻接矩阵形成一个张量,邻接矩阵集合表示为
Figure PCTCN2021123249-appb-000001
形成张量表示为A∈R N*N*K,其中,K=|T V|,N是节点个数。
In a specific implementation, the sentences recognized at the same time in the event text information are a batch, and the adjacency matrix of the sentences of the same batch is formed into a tensor, and the adjacency matrix set is expressed as
Figure PCTCN2021123249-appb-000001
Forming a tensor is expressed as A∈R N*N*K , where K=|T V |, N is the number of nodes.
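As an illustration of this batching step, a minimal sketch might look as follows. This is not the patent's own code; the edge list, the type indices, and the function name are invented for the example.

```python
import numpy as np

def adjacency_tensor(edges, num_nodes, num_types):
    """edges: (head, dependent, type_index) triples taken from the dependency tree."""
    A = np.zeros((num_nodes, num_nodes, num_types), dtype=np.float32)
    for i, j, t in edges:
        A[i, j, t] = 1.0  # one N x N slice per syntactic relation type
    return A

# Toy sentence with 4 tokens and 2 arc types; the type indices are invented.
edges = [(1, 0, 0), (1, 2, 1)]
A = adjacency_tensor(edges, num_nodes=4, num_types=2)
print(A.shape)  # (4, 4, 2)
```

Stacking the per-type matrices along a third axis keeps the syntactic relation types distinguishable inside one batched tensor.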
As described in the following steps, the tensor and the concatenated vector are input into the artificial neural network for computation, and the output vector is generated according to the result of that computation.
As an example, FIG. 4 shows a schematic diagram of a graph attention network provided by an embodiment of the present application; the computation is divided into two steps, calculating the attention coefficients and taking the weighted sum. The tensor and the second concatenated vector are used as the input of the graph attention layer, denoted $h = \{\vec h_1, \vec h_2, \ldots, \vec h_N\}$ with $\vec h_i \in \mathbb{R}^F$, where $N$ is the number of nodes and $F$ is the number of node features. The output is $h' = \{\vec h'_1, \vec h'_2, \ldots, \vec h'_N\}$ with $\vec h'_i \in \mathbb{R}^{F'}$, where $F'$ is the dimension of the new node feature vectors. The attention coefficient between node $i$ and each of its neighbouring nodes $j \in N_i$ is computed, as shown on the left side of FIG. 4, by the following formula:
$$e_{ij} = a\big(W\vec h_i,\; W\vec h_j\big), \quad j \in N_i$$

where $a$ is a mapping $\mathbb{R}^{F'} \times \mathbb{R}^{F'} \to \mathbb{R}$ and $W \in \mathbb{R}^{F' \times F}$ is a weight matrix.
For every node, the graph attention network can use the attention mechanism to compute the similarity coefficient between node i and its neighbour j, so the model need not rely entirely on the graph structure.
The attention coefficients are normalized with softmax:

$$\alpha_{ij} = \operatorname{softmax}_j(e_{ij}) = \frac{\exp\big(\mathrm{LeakyReLU}\big(\vec a^{\,T}\,[W\vec h_i \,\|\, W\vec h_j]\big)\big)}{\sum_{k \in N_i} \exp\big(\mathrm{LeakyReLU}\big(\vec a^{\,T}\,[W\vec h_i \,\|\, W\vec h_k]\big)\big)}$$

where $\|$ denotes vector concatenation. Both $e_{ij}$ and $\alpha_{ij}$ are called "attention coefficients"; $\alpha_{ij}$ is the normalized form of $e_{ij}$.
After the attention coefficients of all nodes have been normalized, the features of the neighbouring nodes are weighted and summed to generate the output vector:

$$\vec h'_i = \sigma\Big(\sum_{j \in N_i} \alpha_{ij}\, W \vec h_j\Big)$$

where $W$ is the weight matrix multiplied with the features, $\sigma$ is a nonlinear activation function, and $j \in N_i$ ranges over all nodes adjacent to $i$.
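The two steps just described — coefficients, softmax normalization, and the weighted sum — can be sketched for a single attention computation in NumPy. The split of the attention term into a source part and a target part, the use of tanh for the activation, the toy chain graph, and all shapes are assumptions made for illustration only.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(H, A, W, a):
    """One attention computation. H: (N, F) node features; A: (N, N) 0/1
    adjacency (1 marks j in N_i); W: (F_out, F); a: (2*F_out,) parameters."""
    WH = H @ W.T                                   # W h_i for every node, (N, F_out)
    F_out = W.shape[0]
    # a^T [W h_i || W h_j] decomposes into a source term plus a target term
    e = leaky_relu((WH @ a[:F_out])[:, None] + (WH @ a[F_out:])[None, :])
    e = np.where(A > 0, e, -np.inf)                # keep only j in N_i
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)   # softmax over neighbours
    return np.tanh(alpha @ WH), alpha              # tanh as sigma (assumption)

rng = np.random.default_rng(0)
N, F, F_out = 4, 8, 6
H = rng.normal(size=(N, F))
# Chain-shaped toy graph with self-loops so every node has a neighbourhood.
A = np.eye(N) + np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
W = rng.normal(size=(F_out, F))
a = rng.normal(size=(2 * F_out,))
H_out, alpha = gat_layer(H, A, W, a)
print(H_out.shape)  # (4, 6)
```

Masking non-neighbours with $-\infty$ before the softmax makes their coefficients exactly zero, so only $j \in N_i$ contributes to the weighted sum.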
The right side of FIG. 4 shows a three-layer graph attention network; the multi-layer attention mechanism assigns different attention weights to different features. For the multi-layer graph attention network, the $K$ attention computations are concatenated:

$$\vec h'_i = \Big\Vert_{k=1}^{K} \sigma\Big(\sum_{j \in N_i} \alpha^{k}_{ij}\, W^{k} \vec h_j\Big)$$

If the multi-layer graph attention network is applied to the output layer, the $K$ computations are averaged instead, and the formula becomes:

$$\vec h'_i = \sigma\Big(\frac{1}{K}\sum_{k=1}^{K}\sum_{j \in N_i} \alpha^{k}_{ij}\, W^{k} \vec h_j\Big)$$
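The contrast between concatenating the $K$ attention computations in hidden layers and averaging them at the output layer can be illustrated with stand-in outputs; the shapes and the tanh activation are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
# K = 3 stand-in attention outputs, each of shape (N, F') = (4, 6)
heads = [rng.normal(size=(4, 6)) for _ in range(3)]

hidden = np.concatenate(heads, axis=1)        # hidden layers: concatenate, (N, K*F')
output = np.tanh(np.mean(heads, axis=0))      # output layer: average then sigma, (N, F')
print(hidden.shape, output.shape)  # (4, 18) (4, 6)
```

Concatenation grows the feature dimension by a factor of $K$, whereas averaging keeps the output at the class-level dimension, which is why averaging is used at the output layer.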
As described in step S130, aggregation information is generated by aggregating the concatenated vector with the output vector.
As an example, at each layer of the graph attention network, the aggregation of syntactic information is realized through a skip-connection module: the concatenated vector bypasses each graph attention layer through the skip-connection module and is then aggregated with the output vector. The skip connection prevents over-propagation of short-distance syntactic information and retains more of the original syntactic information, avoiding a poor final trigger-word classification.
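A sketch of this skip-connection aggregation follows. Elementwise addition is used as the aggregation operation — an assumption for illustration; the text only states that the bypassed vector and the output vector are aggregated.

```python
import numpy as np

def aggregate_with_skip(gat_output, concat_vector):
    # Addition as the aggregation (assumed); the bypassed concatenated vector
    # re-injects the original syntactic information alongside the GAT output.
    return gat_output + concat_vector

x = np.ones((3, 4))        # stand-in for the (second) concatenated vector
y = 0.5 * np.ones((3, 4))  # stand-in for the GAT stack output
z = aggregate_with_skip(y, x)
print(float(z[0, 0]))  # 1.5
```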
As described in step S140, the trigger word category of the context word is determined according to the aggregation information.
In an embodiment of the present application, the specific process in step S140 of "determining the trigger word category of the context word according to the aggregation information" can be further described as follows.
As described in the following steps, the trigger word of the context word is determined according to the aggregation information, and the trigger word is classified by the classifier module.
As an example, the trigger word of the context word is determined according to the aggregation information; the trigger word is classified under the preset conditions of the classifier module, and the event type corresponding to the event sentence is determined according to the classification category of the trigger word. The event types are predefined categories.
Specifically, the preset condition of the classifier module is to aggregate the information of the different modules, pass it through a fully connected layer, and then apply the softmax function (which maps the outputs of multiple neurons into the interval (0, 1) so that they can be interpreted as probabilities, enabling multi-class classification), selecting for each context word the category with the largest probability as the predicted trigger-word label.
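A minimal sketch of such a classifier head — a fully connected layer, a softmax into (0, 1), then an argmax per context word. The feature dimension, the number of classes, and the random weights are illustrative assumptions.

```python
import numpy as np

def classify_triggers(agg, W_fc, b_fc):
    """agg: (N, D) aggregated features for N context words; returns one label per word."""
    logits = agg @ W_fc + b_fc                    # fully connected layer
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)  # softmax: outputs mapped into (0, 1)
    return probs.argmax(axis=1), probs            # largest class probability wins

rng = np.random.default_rng(1)
agg = rng.normal(size=(5, 16))     # 5 context words, 16-dim aggregated features
W_fc = rng.normal(size=(16, 34))   # 34 classes assumed, e.g. event types plus "not a trigger"
b_fc = np.zeros(34)
labels, probs = classify_triggers(agg, W_fc, b_fc)
print(labels.shape)  # (5,)
```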
The event detection method based on a multi-layer graph attention network proposed in the embodiments of the present application is verified experimentally below.
Experimental environment: PyTorch 1.8.0 (an open-source Python machine learning library), Nvidia GeForce RTX 3060 (graphics card), Windows 10 (operating system), Intel Core i7-11700K, 16 GB of memory, and a 1 TB hard disk.
The experimental data are shown in Table 1:

Table 1. Experimental comparison results
[Table 1 is provided as an image in the original publication; its contents are not reproduced here.]
Experimental results: the experiment uses precision (P), recall (R) and F1-score as observation variables, defined as follows:
$$P = \frac{TP}{TP + FP}$$

$$R = \frac{TP}{TP + FN}$$

$$F1 = \frac{2 \times P \times R}{P + R}$$
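The three definitions can be checked with toy counts (the counts themselves are invented for the example):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard definitions of the three observation variables."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    return p, r, f1

# Toy counts: 80 correctly detected triggers, 20 spurious, 20 missed.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=20)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.8 0.8 0.8
```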
To guarantee the accuracy of the experiment, the dataset split used here is kept consistent with the splits used by the other event detection methods. The results show that, compared with traditional event detection methods that use only sentence-level features, the method proposed in this embodiment raises the F1-score by about 8%; compared with methods based on graph neural networks, the proposed method also achieves the highest F1-score and recall.
Referring to FIG. 5, a schematic flow diagram of an event detection method based on a multi-layer graph attention network is shown.
In a specific implementation, after the event text information is acquired, it is analyzed with syntactic analysis technology to generate a syntactic dependency tree, from which the adjacency matrix corresponding to the context words is generated; the adjacency matrices of the same batch of sentences are combined into one tensor. The four different word embedding vectors of each context word are concatenated into a first concatenated vector, which is fed into a Bi-LSTM layer to generate a second concatenated vector. The tensor and the second concatenated vector are input into the multi-layer graph attention network to generate an output vector, aggregating syntactic information at different depths. The concatenated vector also bypasses the multi-layer graph attention network through the skip-connection module and is aggregated with the output vector; the classifier module then classifies the trigger words among the context words and determines the event type corresponding to the event sentence.
As for the apparatus embodiment, since it is basically similar to the method embodiment, its description is relatively brief; for relevant details, refer to the description of the method embodiment.
Referring to FIG. 6, there is shown an event detection apparatus based on a multi-layer graph attention network provided by an embodiment of the present application,
which specifically includes:
an acquisition module 610, configured to acquire context words in event text information, and to determine a syntactic-information adjacency matrix and a concatenated vector corresponding to the context words;
a calculation module 620, configured to use the adjacency matrix and the concatenated vector as the input of an artificial neural network to obtain an output vector;
an aggregation module 630, configured to generate aggregation information by aggregating the concatenated vector with the output vector;
a classification module 640, configured to determine the trigger word category of the context words according to the aggregation information.
In an embodiment of the present application, the acquisition module 610 includes:
an expression submodule, configured to determine, according to the context words, the syntactic information corresponding to the context words;
a generation submodule, configured to generate the syntactic-information adjacency matrix according to the syntactic information;
a concatenation submodule, configured to generate the concatenated vector from the word embedding vectors of the context words.
In an embodiment of the present application, the expression submodule includes:
a dependency analysis submodule, configured to analyze the event text information through syntactic dependency parsing, and to generate the syntactic information corresponding to the context words according to the analysis result of the event text information.
In an embodiment of the present application, the calculation module 620 includes:
an array conversion submodule, configured to combine the adjacency matrices of the same batch into one tensor;
an artificial neural network calculation submodule, configured to input the tensor and the concatenated vector into the artificial neural network for computation, and to generate the output vector according to the result of the computation by the artificial neural network.
In an embodiment of the present application, the classification module 640 includes:
a trigger word processing submodule, configured to determine the trigger words of the context words according to the aggregation information, and to classify the trigger words via the classifier module.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once the basic inventive concept is known. Therefore, the appended claims are intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.
Finally, it should also be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or terminal device comprising that element.
The event detection method and apparatus based on a multi-layer graph attention network provided by the present application have been described in detail above. Specific examples have been used herein to explain the principles and implementations of the present application; the description of the above embodiments is only intended to help understand the method of the present application and its core idea. At the same time, those of ordinary skill in the art may, according to the idea of the present application, make changes to the specific implementation and scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

  1. An event detection method based on a multi-layer graph attention network, characterized in that it comprises the steps of:
    acquiring context words in event text information, and determining a syntactic-information adjacency matrix and a concatenated vector corresponding to the context words;
    using the adjacency matrix and the concatenated vector as the input of an artificial neural network to obtain an output vector;
    generating aggregation information by aggregating the concatenated vector with the output vector;
    determining the trigger word category of the context words according to the aggregation information.
  2. The event detection method based on a multi-layer graph attention network according to claim 1, characterized in that the step of acquiring context words in the event text information and determining the syntactic-information adjacency matrix and concatenated vector corresponding to the context words comprises:
    determining, according to the context words, syntactic information corresponding to the context words;
    generating the syntactic-information adjacency matrix according to the syntactic information;
    generating the concatenated vector from the word embedding vectors of the context words.
  3. The event detection method based on a multi-layer graph attention network according to claim 2, characterized in that the step of determining, according to the context words, the syntactic information corresponding to the context words comprises:
    analyzing the event text information through syntactic dependency parsing, and generating the syntactic information corresponding to the context words according to the analysis result of the event text information.
  4. The event detection method based on a multi-layer graph attention network according to claim 1, characterized in that the step of using the adjacency matrix and the concatenated vector as the input of the artificial neural network to obtain the output vector comprises:
    combining the adjacency matrices of the same batch into one tensor;
    inputting the tensor and the concatenated vector into the artificial neural network for computation, and generating the output vector according to the result of the computation by the artificial neural network.
  5. The event detection method based on a multi-layer graph attention network according to claim 1, characterized in that the step of determining the trigger word category of the context words according to the aggregation information comprises:
    determining the trigger words of the context words according to the aggregation information, and classifying the trigger words via a classifier module.
  6. An event detection apparatus based on a multi-layer graph attention network, characterized in that it comprises:
    an acquisition module, configured to acquire context words in event text information, and to determine a syntactic-information adjacency matrix and a concatenated vector corresponding to the context words;
    a calculation module, configured to use the adjacency matrix and the concatenated vector as the input of an artificial neural network to obtain an output vector;
    an aggregation module, configured to generate aggregation information by aggregating the concatenated vector with the output vector;
    a classification module, configured to determine the trigger word category of the context words according to the aggregation information.
  7. The event detection apparatus based on a multi-layer graph attention network according to claim 6, characterized in that the acquisition module comprises:
    an expression submodule, configured to determine, according to the context words, syntactic information corresponding to the context words;
    a generation submodule, configured to generate the syntactic-information adjacency matrix according to the syntactic information;
    a concatenation submodule, configured to generate the concatenated vector from the word embedding vectors of the context words.
  8. The event detection apparatus based on a multi-layer graph attention network according to claim 7, characterized in that the expression submodule comprises:
    a dependency analysis submodule, configured to analyze the event text information through syntactic dependency parsing, and to generate the syntactic information corresponding to the context words according to the analysis result of the event text information.
  9. The event detection apparatus based on a multi-layer graph attention network according to claim 6, characterized in that the calculation module comprises:
    an array conversion submodule, configured to combine the adjacency matrices of the same batch into one tensor;
    an artificial neural network calculation submodule, configured to input the tensor and the concatenated vector into the artificial neural network for computation, and to generate the output vector according to the result of the computation by the artificial neural network.
  10. The event detection apparatus based on a multi-layer graph attention network according to claim 6, characterized in that the classification module comprises:
    a trigger word processing submodule, configured to determine the trigger words of the context words according to the aggregation information, and to classify the trigger words via a classifier module.
PCT/CN2021/123249 2021-09-30 2021-10-12 Event detection method and apparatus based on multi-layer graph attention network WO2023050470A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111164755.1 2021-09-30
CN202111164755.1A CN113887213A (en) 2021-09-30 2021-09-30 Event detection method and device based on multilayer graph attention network

Publications (1)

Publication Number Publication Date
WO2023050470A1 true WO2023050470A1 (en) 2023-04-06

Family

ID=79005069

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/123249 WO2023050470A1 (en) 2021-09-30 2021-10-12 Event detection method and apparatus based on multi-layer graph attention network

Country Status (2)

Country Link
CN (1) CN113887213A (en)
WO (1) WO2023050470A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116303996A (en) * 2023-05-25 2023-06-23 江西财经大学 Theme event extraction method based on multifocal graph neural network
CN116629237A (en) * 2023-07-25 2023-08-22 江西财经大学 Event representation learning method and system based on gradually integrated multilayer attention
CN116701576A (en) * 2023-08-04 2023-09-05 华东交通大学 Event detection method and system without trigger words

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259142A (en) * 2020-01-14 2020-06-09 华南师范大学 Specific target emotion classification method based on attention coding and graph convolution network
US20200356628A1 (en) * 2019-05-07 2020-11-12 International Business Machines Corporation Attention-based natural language processing
CN112163416A (en) * 2020-10-09 2021-01-01 北京理工大学 Event joint extraction method for merging syntactic and entity relation graph convolution network
CN112347248A (en) * 2020-10-30 2021-02-09 山东师范大学 Aspect-level text emotion classification method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BAI XUEFENG; LIU PENGBO; ZHANG YUE: "Investigating Typed Syntactic Dependencies for Targeted Sentiment Classification Using Graph Attention Neural Network", IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, IEEE, USA, vol. 29, 2 December 2020 (2020-12-02), USA, pages 503 - 514, XP011829612, ISSN: 2329-9290, DOI: 10.1109/TASLP.2020.3042009 *

Also Published As

Publication number Publication date
CN113887213A (en) 2022-01-04


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21959034

Country of ref document: EP

Kind code of ref document: A1