WO2023051399A1 - Generative event extraction method based on ontology guidance - Google Patents

Generative event extraction method based on ontology guidance

Info

Publication number
WO2023051399A1
WO2023051399A1 PCT/CN2022/120840 CN2022120840W WO2023051399A1 WO 2023051399 A1 WO2023051399 A1 WO 2023051399A1 CN 2022120840 W CN2022120840 W CN 2022120840W WO 2023051399 A1 WO2023051399 A1 WO 2023051399A1
Authority
WO
WIPO (PCT)
Prior art keywords
event
ontology
extraction
argument
type
Prior art date
Application number
PCT/CN2022/120840
Other languages
English (en)
French (fr)
Inventor
陈华钧
叶宏彬
张宁豫
邓淑敏
毕祯
Original Assignee
浙江大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US18/280,655 (US20240143633A1)
Application filed by 浙江大学
Publication of WO2023051399A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation

Definitions

  • the invention relates to the technical field of information extraction in natural language processing, in particular to a generative event extraction method based on ontology guidance.
  • an event is defined as an objective fact in which specific people or objects interact at a specific time and place, generally at the sentence level.
  • TDT Topic Detection Tracking
  • an event is a set of related descriptions about a topic, which can be formed by classification or clustering.
  • the elements that make up an event include: the trigger word, the event type (Event Type), the event argument (Event Argument) and the argument role (Argument Role).
  • the event trigger word represents the core word of the event, mostly a verb or a noun.
  • the event type refers to the type to which the event belongs.
  • the event argument refers to the participants of the event, mainly composed of entity, value, and time, and the value is a non-entity event participant, such as a job position.
  • Argument roles refer to the roles that event arguments play in an event.
  • Event extraction is to extract events of interest to users from unstructured information and present them to users in a structured manner.
  • the event extraction task can be decomposed into 4 subtasks: trigger word recognition, event type classification, argument recognition and role classification tasks.
  • trigger word recognition and event type classification can be combined into an event recognition task.
  • Event recognition and classification determine the type of event to which each word in a sentence belongs. It is a word-based multi-classification task.
  • Argument recognition and role classification can be combined into an argument role classification task.
  • the role classification task is a multi-classification task based on word pairs, which determines the role relationship between any pair of trigger words and entities in a sentence.
  • the purpose of event extraction is to identify event triggers together with their arguments in text, usually formulated as a classification or structured prediction problem. For example, given the input sentence "The divorce settlement called for Giuliani to pay Hanover more than $6.8million.", event extraction should extract two events. One has the event type "Life:Divorce" and the trigger word "divorce", with one event argument: the argument span "Giuliani" with role type "Person".
  • the other has the event type "Transaction:Transfer-Money" and the trigger word "pay", with three event arguments: the argument span "Giuliani" with role type "Giver"; the argument span "$6.8million" with role type "Money"; and the argument span "Hanover" with role type "Recipient". Sentences containing multiple events make event extraction more challenging. In addition, argument spans may overlap across events: "Giuliani" in the example sentence must play different argument roles in two events of different types at the same time.
  • the purpose of the present invention is to provide an ontology-guided generative event extraction method to efficiently extract event structured knowledge in fully supervised and few-shot scenarios.
  • a generative event extraction method guided by ontology comprising the following steps:
  • Step 1 build event ontology knowledge base according to domain knowledge base and event annotation framework
  • Step 2 design the event trigger word extraction template and event argument extraction template for generative event extraction;
  • the event trigger word extraction template maps the input event text to the first input sequence of the event extraction model;
  • the event argument extraction template maps the input event text fused with the event ontology to the second input sequence of the event extraction model;
  • Step 3 designing a class label mapping function, which handles the mapping of multi-word labels to event types and/or role types;
  • Step 4 for the input event text, extract the event ontology corresponding to the input event from the event ontology knowledge base, and construct the first input sequence and the second input sequence;
  • Step 5 the first input sequence and the second input sequence are input into the event extraction model, and the event extraction model predicts the event type and role type according to the class label mapping function and its own processing mechanism, and simultaneously outputs the event trigger word span and event argument span.
  • the beneficial effects of the present invention at least include:
  • on the basis of the constructed event ontology knowledge base, the designed event trigger word extraction template and event argument extraction template integrate the event ontology and the added prompt words into the input sequence, so that event ontology knowledge is injected into the event extraction model and the correlation between event trigger words and event arguments is modeled implicitly;
  • the prompt words fused with the event ontology then guide the generation of the event sequence text, which improves performance in fully supervised and few-shot scenarios, speeds up convergence of the event extraction model, and improves the speed and accuracy of multi-event extraction and overlapping event argument extraction, giving the method practical industrial value.
  • Fig. 1 is a flow chart of an ontology-guided generative event extraction method provided by an embodiment
  • Fig. 2 is an overall framework diagram of an ontology-guided generative event extraction method provided by an embodiment
  • Fig. 3 is a model structure diagram of event trigger word extraction and event type classification provided by an embodiment
  • Fig. 4 is a model structure diagram of event argument extraction and event role classification provided by an embodiment.
  • an ontology-guided generative event extraction method which reorganizes structured event information into text information as supervision, and uses an end-to-end language generation model to guide the generation of sequence text containing event information.
  • to incorporate external event knowledge, an event ontology knowledge base is constructed for each sub-event type from external knowledge bases, connections between events are established by a propagation algorithm, and the event ontology knowledge base is serialized and integrated with the prompt template, injecting event ontology knowledge into the model to implicitly model the correlation between event trigger words and event arguments.
  • a prompt-based fine-tuning method narrows the gap between the pre-trained model and the fine-tuning task, and improves the efficiency with which knowledge in the pre-trained model is transferred and adapted to downstream tasks.
  • Fig. 1 is a flowchart of an ontology-guided generative event extraction method provided by an embodiment
  • Fig. 2 is an overall framework diagram of an ontology-guided generative event extraction method provided by an embodiment, as shown in Fig. 1 and Fig. 2
  • the generative event extraction method provided by the embodiment includes the following steps:
  • Step 1 Construct event ontology knowledge base according to domain knowledge base and event annotation framework.
  • the process of building an event ontology knowledge base is:
  • Step 1.1 use the event framework predefined by ACE as the target event ontology; the ACE corpus, released by the Linguistic Data Consortium (LDC), consists of various types of data annotated with entities, relations and events, and provides a detailed framework for annotating event information;
  • LDC Linguistic Data Consortium
  • Step 1.2 extract the event frames in FrameNet related to the target event ontology as the extended event ontology
  • FrameNet takes frame semantics as its theoretical basis, under which the meaning of most words is best understood through a semantic frame (a description of an event, relation or entity and its participants); semantic frames related to the event ontology are used to extend the sources from which the event ontology is built;
  • Step 1.3 Integrate the target event ontology and the extended event ontology, perform deduplication and manual inspection, and obtain the event ontology knowledge base.
  • for the divorce event, the ontology knowledge base contains core event ontologies (Event Ontology) such as "injure", "divorce", and "transfer-money".
  • Each event ontology has its own non-core event ontologies.
  • the "divorce" core event ontology is associated with non-core event ontologies such as "person", "time", "purpose", and "partners".
  • the "injure" core event ontology points to the "divorce" core event ontology through the "cause" relation, indicating that the injury event is a cause of the divorce event.
  • Step 2 design the event trigger word extraction template and event argument extraction template for generative event extraction.
  • the designed extraction template maps the input text fused with the event ontology to the input sequence of the standard pre-trained event extraction model; that is, the event ontology and the input event text jointly form the prompt template for generative event extraction.
  • because trigger word extraction and event argument extraction are different tasks, a different prompt template is designed for each.
  • the designed event trigger word extraction template can map the input event text to the first input sequence of the event extraction model, and the specifically designed event trigger word extraction template is:
  • the pseudo template uses virtual pseudo-tokens that are unused in the pre-trained word embeddings, such as [unused1][unused2], ..., [unused9]; for simplicity, <pseudo template> is denoted s_1 and <input sentence> is denoted s_2.
  • the designed event argument extraction template maps the input event text integrated into the event ontology into the second input sequence of the event extraction model; the designed event argument extraction template is:
  • Step 3 designing a class label mapping function, which handles the mapping of multi-word labels to event types and/or role types.
  • the embodiment designs a class label mapping function to handle the mapping of multi-word tags to event types and/or role types .
  • the designed class label mapping function is:
  • Y(r_i) represents the mapping function between the i-th event type and its multi-word labels
  • w_n represents the word embedding vector of the n-th vocabulary label of the event type
  • Y(r_i) represents the mapping function between the i-th role type and its multi-word labels
  • w_n represents the word embedding vector of the n-th vocabulary label of the role type.
  • the event type or role type can be predicted based on the above class label mapping function.
  • Step 4 constructing the input sequence of the input event extraction model for the input event text.
  • the input event text is first preprocessed, which includes deleting invalid HTML markup such as <div> and <style>, and deleting words that do not appear in the predefined vocabulary.
  • the event ontology corresponding to the input event is then extracted from the event ontology knowledge base through rule matching.
  • the first input sequence and the second input sequence are constructed from the input event text and the event ontology according to the event trigger word extraction template and the event argument extraction template.
  • Step 5 using the event extraction model to perform text prediction based on the input of the first input sequence and the second input sequence.
  • the first input sequence and the second input sequence are input to the event extraction model, and the event extraction model predicts the event type and role type according to the class label mapping function and its own processing mechanism, and simultaneously outputs the event trigger word span and event argument span.
  • the event extraction model adopts an encoder-decoder Transformer framework.
  • p(r_i) represents the predicted probability of the i-th event type or role label r_i
  • h_[MASK] represents the output vector of the event extraction model at the [MASK] position
  • w represents the word embedding vector of the vocabulary label of the target event type/role type
  • w' represents the word embedding vectors of the vocabulary labels of all event types/role types
  • R represents the set of all event types/role labels
  • Z is the latent source domain representation learned in encoder E
  • H denotes the latent source domain representation learned in decoder G
  • the parameters of the event extraction model are fine-tuned using the first input sequence and the second input sequence constructed based on the event trigger word extraction template and the event argument extraction template, and the fine-tuned event extraction model is used to perform prediction tasks.
  • the loss function used is:
  • the extraction template for event trigger word extraction is input, as a text sequence, into the standard pre-trained encoder-decoder Transformer framework to predict the event type and the event trigger word span.
  • the input sequence of the model encoder is "[unused1][unused2]...[unused9]The divorce settlement called for Giuliani to pay Hanover more than $6.8million."
  • the output supervision sequence of the model decoder is "The trigger type is [MASK], trigger token is".
  • event trigger type prediction outputs the probability of each event type at the "[MASK]" position, sorted in descending order of probability:
  • the event type is determined to be "Life:Divorce", and the model generates the event trigger word span "divorce".
  • the final output sequence text is "The trigger type is Life:Divorce, trigger token is divorce."
  • the input sequence of the model encoder is "divorce time place person fine partners injure victim agent place bodypart transfer money giver recipient money beneficiary. The divorce settlement called for Giuliani to pay Hanover more than $6.8million."
  • the output supervision sequence of the model decoder is "The argument type is [MASK], argument token is".
  • role type prediction for event arguments outputs the probability of each role type at the "[MASK]" position, sorted in descending order of probability:
  • the role type is determined to be "Person", and the model generates the corresponding event argument span "Giuliani Hanover".
  • the final output sequence text is "The argument type is Person, argument token is Giuliani Hanover."
  • Step 6 Standardize the predicted event types and role types.
  • after types below the probability threshold are filtered out, event types are mapped to event type serial numbers, and role types are mapped to role type serial numbers.
  • Step 7 Normalize the predicted event trigger word spans and event argument spans.
  • after spans below the probability threshold are filtered out, event trigger word spans are mapped to event trigger word span labels, and event argument spans are mapped to event argument span labels.
  • Step 8 Integrate the event trigger word with the type serial number and span label of the event argument, and transfer it to the structured database for storage.
  • one event has the event type "Life:Divorce" with type serial number 14; the event trigger word is "divorce" with span label (2, 3); it has one event argument: the argument span "Giuliani" with span label (6, 7), role type "Person", and role type serial number 35.
  • the other event has the event type "Transaction:Transfer-Money" with type serial number 20; the trigger word is "pay" with span label (8, 9); it has three event arguments: the argument span "Giuliani" with span label (6, 7), role type "Giver", and role type serial number 45; the argument span "$6.8million" with span label (12, 14), role type "Money", and role type serial number 46; and the argument span "Hanover" with span label (9, 10), role type "Recipient", and role type serial number 47; finally, the structured event extraction result is transferred to the structured database for storage.
  • the generative event extraction method based on ontology guidance provided by the above embodiments can be used for structured arrangement of Internet news information, and automatically extracts news event names and associated news event arguments, which can be used for news orientation recommendation and associated document arrangement , hotspot-based search queries and other downstream scenarios to improve the speed and accuracy of news event extraction.
  • the ontology-guided generative event extraction method extracts structured knowledge from complex input event texts using natural language processing techniques: it uses an end-to-end language generation model as the event extraction model and guides event extraction with event ontology knowledge, which improves model performance in fully supervised and few-shot scenarios and provides a better solution for efficiently extracting information from public corpora.
  • to address the complex model architectures and weak generalization to new event types of existing methods, the present invention introduces an event ontology library that injects event knowledge through prompt construction, giving faster model convergence and significantly improved performance in few-shot settings.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Machine Translation (AREA)

Abstract

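The invention discloses a generative event extraction method based on ontology guidance, comprising: (1) constructing an event ontology knowledge base; (2) designing an event trigger word extraction template and an event argument extraction template, which respectively map the input event text to a first input sequence and map the input event text fused with the event ontology to a second input sequence; (3) designing a class label mapping function that maps multi-word labels to event types and/or role types; (4) after extracting the event ontology corresponding to the input event from the event ontology knowledge base, constructing the first input sequence and the second input sequence according to the event trigger word extraction template and the event argument extraction template, and inputting them into the event extraction model; (5) the event extraction model predicts the event type and role type according to the class label mapping function and its own processing mechanism, and simultaneously outputs the event trigger word span and the event argument span. The method efficiently extracts structured event knowledge in fully supervised and few-shot scenarios.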

Description

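Generative event extraction method based on ontology guidance
Technical Field
The present invention relates to the technical field of information extraction in natural language processing, and in particular to a generative event extraction method based on ontology guidance.
Background
As a form of information, an event is defined as an objective fact in which specific people or objects interact at a specific time and place, generally at the sentence level. In Topic Detection and Tracking (TDT), an event refers to a set of related descriptions about a topic, where the topic may be formed by classification or clustering.
The elements that make up an event include the trigger word, the event type, the event argument, and the argument role. The event trigger word is the core word indicating that the event occurs, usually a verb or a noun. The event type is the type to which the event belongs. Event arguments are the participants of the event, mainly entities, values, and times; a value is a non-entity event participant, such as a job position. The argument role is the role an event argument plays in the event.
Event extraction extracts events of interest to users from unstructured information and presents them in structured form. The task can be decomposed into four subtasks: trigger word recognition, event type classification, argument recognition, and role classification. Trigger word recognition and event type classification can be combined into an event recognition task. Event recognition determines the event type to which each word in a sentence belongs and is a word-based multi-class classification task. Argument recognition and role classification can be combined into an argument role classification task. Role classification is a multi-class classification task based on word pairs, which determines the role relationship between any trigger word and entity pair in a sentence.
The purpose of event extraction is to identify event triggers together with their arguments in text, usually formulated as a classification or structured prediction problem. For example, given the input sentence "The divorce settlement called for Giuliani to pay Hanover more than $6.8million.", event extraction should extract two events. One has the event type "Life:Divorce" and the trigger word "divorce", with one event argument: the argument span "Giuliani" with role type "Person". The other has the event type "Transaction:Transfer-Money" and the trigger word "pay", with three event arguments: the argument span "Giuliani" with role type "Giver"; the argument span "$6.8million" with role type "Money"; and the argument span "Hanover" with role type "Recipient". Sentences containing multiple events make event extraction more challenging. In addition, argument spans may overlap across events: "Giuliani" in the example sentence must play different argument roles in two events of different types at the same time.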
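The target structure for the example sentence can be written down as plain records. The following is a minimal sketch, assuming hypothetical class names (`Argument`, `Event`) and a hypothetical helper (`roles_by_span`) that are not part of the patent; it shows how the span "Giuliani" ends up with a different role in each of the two events.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Argument:
    span: str  # surface text of the argument
    role: str  # role type the argument plays in this event


@dataclass
class Event:
    event_type: str
    trigger: str
    arguments: List[Argument] = field(default_factory=list)


# The two events named in the text for the example sentence.
events = [
    Event("Life:Divorce", "divorce", [Argument("Giuliani", "Person")]),
    Event("Transaction:Transfer-Money", "pay", [
        Argument("Giuliani", "Giver"),
        Argument("$6.8million", "Money"),
        Argument("Hanover", "Recipient"),
    ]),
]


def roles_by_span(events: List[Event]) -> Dict[str, List[Tuple[str, str]]]:
    """Collect, for each argument span, the (event type, role) pairs it fills."""
    table: Dict[str, List[Tuple[str, str]]] = {}
    for event in events:
        for arg in event.arguments:
            table.setdefault(arg.span, []).append((event.event_type, arg.role))
    return table


# Spans that fill a role in more than one event are the overlapping arguments.
overlapping = {s: r for s, r in roles_by_span(events).items() if len(r) > 1}
print(overlapping)
```

A single tag per token cannot record both roles for the same span, which is the limitation of sequence labeling that motivates the generative formulation.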
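Traditional methods perform event extraction by sequence labeling, which cannot solve the problem of overlapping argument roles. In addition, traditional event extraction models usually require complex model architectures and generalize poorly to new event types.
Summary of the Invention
In view of the above, the purpose of the present invention is to provide a generative event extraction method based on ontology guidance, so as to efficiently extract structured event knowledge in fully supervised and few-shot scenarios.
To achieve the above purpose, the present invention provides the following technical solution:
A generative event extraction method based on ontology guidance, comprising the following steps:
Step 1, construct an event ontology knowledge base according to a domain knowledge base and an event annotation framework;
Step 2, design an event trigger word extraction template and an event argument extraction template for generative event extraction; the event trigger word extraction template maps the input event text to the first input sequence of the event extraction model; the event argument extraction template maps the input event text fused with the event ontology to the second input sequence of the event extraction model;
Step 3, design a class label mapping function, which handles the mapping of multi-word labels to event types and/or role types;
Step 4, for the input event text, extract the event ontology corresponding to the input event from the event ontology knowledge base, and, from the input event text and the event ontology, construct the first input sequence and the second input sequence according to the event trigger word extraction template and the event argument extraction template;
Step 5, input the first input sequence and the second input sequence into the event extraction model; the event extraction model predicts the event type and role type according to the class label mapping function and its own processing mechanism, and simultaneously outputs the event trigger word span and the event argument span.
Compared with the prior art, the beneficial effects of the present invention at least include:
On the basis of the constructed event ontology knowledge base, the designed event trigger word extraction template and event argument extraction template fuse the event ontology and the added prompt words into the input sequence. This injects event ontology knowledge into the event extraction model and implicitly models the correlation between event trigger words and event arguments. The prompt words fused with the event ontology then guide the generation of the event sequence text, which improves performance in fully supervised and few-shot scenarios, speeds up convergence of the event extraction model, and improves the speed and accuracy of multi-event extraction and overlapping event argument extraction, giving the method practical industrial value.
Brief Description of the Drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the generative event extraction method based on ontology guidance provided by an embodiment;
Fig. 2 is an overall framework diagram of the generative event extraction method based on ontology guidance provided by an embodiment;
Fig. 3 is a model structure diagram of event trigger word extraction and event type classification provided by an embodiment;
Fig. 4 is a model structure diagram of event argument extraction and event role classification provided by an embodiment.
Detailed Description
To make the purpose, technical solution, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit its scope of protection.
Existing event extraction methods suffer from overlapping argument roles, complex model architectures, and weak generalization to new event types, which make event extraction slow and inaccurate. To solve these problems, the present invention provides a generative event extraction method based on ontology guidance, which reorganizes structured event information into text as supervision and uses an end-to-end language generation model to guide the generation of sequence text containing event information. To incorporate external event knowledge, an event ontology knowledge base is built for each sub-event type from external knowledge bases, connections between events are established by a propagation algorithm, and the event ontology knowledge base is serialized and integrated with the prompt template, injecting event ontology knowledge into the model and implicitly modeling the correlation between event trigger words and event arguments. On this basis, event extraction is treated under a new framework as natural language generation; the prompt-based fine-tuning method narrows the gap between the pre-trained model and the fine-tuning task and improves the efficiency with which knowledge in the pre-trained model is transferred and adapted to downstream tasks.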
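Fig. 1 is a flowchart of the generative event extraction method based on ontology guidance provided by an embodiment, and Fig. 2 is an overall framework diagram of the method. As shown in Fig. 1 and Fig. 2, the generative event extraction method provided by the embodiment includes the following steps:
Step 1, construct an event ontology knowledge base according to a domain knowledge base and an event annotation framework.
In the embodiment, the process of constructing the event ontology knowledge base is:
Step 1.1, use the event framework predefined by ACE as the target event ontology. The ACE corpus, released by the Linguistic Data Consortium (LDC), consists of various types of data annotated with entities, relations, and events, and provides a detailed framework for annotating event information;
Step 1.2, extract the event frames in FrameNet related to the target event ontology as the extended event ontology. FrameNet takes frame semantics as its theoretical basis, under which the meaning of most words is best understood through a semantic frame (a description of an event, relation, or entity and its participants); semantic frames related to the event ontology are used to extend the sources from which the event ontology is built;
Step 1.3, integrate the target event ontology and the extended event ontology, then deduplicate and manually check them to obtain the event ontology knowledge base.
Taking the ACE2005 event extraction dataset as an example, the method of Step 1 builds 33 core event ontologies and 1161 non-core event ontologies, and the event-to-event propagation algorithm establishes 28 relation ontologies between events.
The divorce event illustrates the event extraction framework. For the divorce event, the ontology knowledge base of the event dataset contains core event ontologies (Event Ontology) such as "injure", "divorce", and "transfer-money". Each event ontology has its own non-core event ontologies; for example, the "divorce" core event ontology is associated with non-core event ontologies such as "person", "time", "purpose", and "partners". Events are also related to one another; for example, the "injure" core event ontology points to the "divorce" core event ontology through the "cause" relation, indicating that the injury event is a cause of the divorce event.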
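The merge-and-deduplicate step and the event-to-event relations can be illustrated with ordinary dictionaries. This is a sketch under stated assumptions: the function names (`merge_ontologies`, `causes_of`), the storage layout, and the way the toy entries are split between the two sources are all hypothetical; only the ontology names and the "cause" relation come from the text. The manual check of Step 1.3 is not modeled.

```python
def merge_ontologies(target, extended):
    """Merge two {core_event: [non_core, ...]} maps, dropping duplicate
    non-core entries while keeping first-seen order."""
    merged = {}
    for source in (target, extended):
        for core, non_core in source.items():
            slot = merged.setdefault(core, [])
            for name in non_core:
                if name not in slot:
                    slot.append(name)
    return merged


# Hypothetical toy entries standing in for the ACE-derived target ontology
# and the FrameNet-derived extended ontology.
ace_like = {"divorce": ["person", "time", "place"]}
framenet_like = {
    "divorce": ["time", "purpose", "partners"],
    "injure": ["victim", "agent"],
}

ontology = merge_ontologies(ace_like, framenet_like)

# Event-to-event relations as (head, relation, tail) triples:
# "injure" points to "divorce" through the "cause" relation.
relations = [("injure", "cause", "divorce")]


def causes_of(event, relations):
    """Return the core events linked to `event` by a "cause" relation."""
    return [head for head, rel, tail in relations if rel == "cause" and tail == event]


print(ontology["divorce"])
print(causes_of("divorce", relations))
```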
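Step 2, design the event trigger word extraction template and the event argument extraction template for generative event extraction.
In the embodiment, an extraction template maps the input text fused with the event ontology to the input sequence of a standard pre-trained event extraction model; that is, the event ontology and the input event text jointly form the prompt template for generative event extraction. Because trigger word extraction and event argument extraction are different tasks, a different prompt template is designed for each.
In the embodiment, the event trigger word extraction template maps the input event text to the first input sequence of the event extraction model. The template is:
[first marker]<pseudo template><input event text>[second marker]The trigger word is [MASK], trigger token is; with the English marker tokens this reads [CLS]<pseudo template><input sentence>[SOS]The trigger word is [MASK], trigger token is;
Here the pseudo template uses virtual pseudo-tokens that are unused in the pre-trained word embeddings, such as [unused1][unused2], ..., [unused9]. For simplicity, <pseudo template> is denoted s_1 and <input sentence> is denoted s_2.
In the embodiment, the event argument extraction template maps the input event text fused with the event ontology to the second input sequence of the event extraction model. The template is:
[first marker]<event ontology><input event text>[second marker]The argument type is [MASK], argument token is; with the English marker tokens this reads [CLS]<Event ontology><input sentence>[SOS]The argument type is [MASK], argument token is. Here the event ontology (Event ontology) slot is filled with the event ontologies mentioned in the event ontology knowledge base. For simplicity, <Event ontology> is denoted s_1 and <input sentence> is denoted s_2.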
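Filling the two templates is plain string assembly. The sketch below assumes hypothetical function names (`build_trigger_input`, `build_argument_input`), nine pseudo-tokens as in the [unused1]...[unused9] example, and ontology terms joined by spaces and closed with a period, as in the worked example later in the description. It produces the flat template string; how a real tokenizer splits it between encoder and decoder is not modeled.

```python
# [unused1][unused2]...[unused9], the pseudo template s_1 for trigger extraction.
PSEUDO_TEMPLATE = "".join(f"[unused{i}]" for i in range(1, 10))


def build_trigger_input(sentence):
    """First input sequence: [CLS]<pseudo template><input sentence>[SOS]<prompt>."""
    prompt = "The trigger word is [MASK], trigger token is"
    return f"[CLS]{PSEUDO_TEMPLATE}{sentence}[SOS]{prompt}"


def build_argument_input(ontology_terms, sentence):
    """Second input sequence: [CLS]<Event ontology><input sentence>[SOS]<prompt>."""
    ontology_text = " ".join(ontology_terms) + "."
    prompt = "The argument type is [MASK], argument token is"
    return f"[CLS]{ontology_text} {sentence}[SOS]{prompt}"


sentence = "The divorce settlement called for Giuliani to pay Hanover more than $6.8million."
first_input = build_trigger_input(sentence)
second_input = build_argument_input(["divorce", "time", "place"], sentence)
print(first_input)
print(second_input)
```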
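Step 3, design a class label mapping function, which handles the mapping of multi-word labels to event types and/or role types.
Sometimes several word labels together form one event type or role type. To predict event types and role types accurately, the embodiment designs a class label mapping function that handles the mapping of multi-word labels to event types and/or role types.
In the embodiment, the class label mapping function is:
Y(r_i) = {w_1, w_2, ..., w_n}
For event type prediction, Y(r_i) is the mapping function between the i-th event type and its multi-word labels, and w_n is the word embedding vector of the n-th vocabulary label of the event type;
For role type prediction, Y(r_i) is the mapping function between the i-th role type and its multi-word labels, and w_n is the word embedding vector of the n-th vocabulary label of the role type.
Event types or role types can be predicted based on the above class label mapping function.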
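The mapping Y(r_i) can be sketched as a lookup from a type label to the embedding vectors of its constituent words. The splitting rule (break on any non-letter character), the function names, and the two-dimensional toy embedding table below are assumptions for illustration; the source only states that a label maps to the set {w_1, ..., w_n}.

```python
import re


def label_words(label):
    """Split a type label such as 'Transaction:Transfer-Money' into lowercase words."""
    return [w.lower() for w in re.split(r"[^A-Za-z]+", label) if w]


def class_label_mapping(labels, embedding):
    """Y(r_i) = {w_1, ..., w_n}: map each label to the vectors of its words."""
    return {label: [embedding[w] for w in label_words(label)] for label in labels}


# Hypothetical toy embedding table; a real model would use its own embeddings.
toy_embedding = {
    "life": [1.0, 0.0],
    "divorce": [0.0, 1.0],
    "transaction": [0.5, 0.5],
    "transfer": [0.2, 0.8],
    "money": [0.9, 0.1],
}

mapping = class_label_mapping(
    ["Life:Divorce", "Transaction:Transfer-Money"], toy_embedding
)
print(label_words("Transaction:Transfer-Money"))
print(mapping["Life:Divorce"])
```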
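Step 4, construct the input sequences of the event extraction model from the input event text.
In the embodiment, the input event text is first preprocessed, which includes deleting invalid HTML markup such as <div> and <style>, and deleting words that do not appear in the predefined vocabulary. Then the event ontology corresponding to the input event is extracted from the event ontology knowledge base through rule matching. Next, from the input event text and the event ontology, the first input sequence and the second input sequence are constructed according to the event trigger word extraction template and the event argument extraction template.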
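The preprocessing and rule-matching steps can be sketched as follows. The tag-stripping regular expression, the whitespace tokenization, and the interpretation of "rule matching" as exact keyword lookup against core event names are all assumptions; the source does not specify the matching rules.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")  # matches markup such as <div> or </style>


def preprocess(text, vocabulary):
    """Strip HTML tags, then drop tokens missing from the predefined vocabulary."""
    no_tags = TAG_RE.sub(" ", text)
    return [tok for tok in no_tags.split() if tok.lower() in vocabulary]


def match_ontology(tokens, knowledge_base):
    """Rule matching, assumed here to be exact lookup of core event names."""
    lowered = {t.lower() for t in tokens}
    return {core: non_core for core, non_core in knowledge_base.items() if core in lowered}


# Hypothetical vocabulary and knowledge base entries.
vocabulary = {"the", "divorce", "settlement"}
knowledge_base = {
    "divorce": ["person", "time", "purpose", "partners"],
    "injure": ["victim", "agent"],
}

tokens = preprocess("<div>The divorce settlement</div> xyzzy", vocabulary)
matched = match_ontology(tokens, knowledge_base)
print(tokens)
print(matched)
```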
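Step 5, use the event extraction model to perform text prediction on the first input sequence and the second input sequence.
In the embodiment, the first input sequence and the second input sequence are input into the event extraction model; the event extraction model predicts the event type and role type according to the class label mapping function and its own processing mechanism, and simultaneously outputs the event trigger word span and the event argument span. In the embodiment, the event extraction model adopts an encoder-decoder Transformer framework.
When the event extraction model predicts event types and/or role types, the prediction probability is obtained with the following formula:
Figure PCTCN2022120840-appb-000001
where p(r_i) is the predicted probability of the i-th event type or role label r_i, h_[MASK] is the output vector of the event extraction model at the [MASK] position, w is the word embedding vector of the vocabulary label of the target event type/role type, w' ranges over the word embedding vectors of the vocabulary labels of all event types/role types, and R is the set of all event types/role labels;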
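The formula itself is an image that did not survive extraction. The symbol definitions are consistent with a softmax over dot products, p(r_i) = exp(w · h_[MASK]) / Σ_{w' ∈ R} exp(w' · h_[MASK]); this form is an assumption, not the confirmed formula. The sketch below implements that assumed form with one embedding vector per label; the vectors and the function name `label_probabilities` are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def label_probabilities(h_mask, label_vectors):
    """Softmax over dot products between h_[MASK] and each label embedding w."""
    scores = {label: dot(w, h_mask) for label, w in label_vectors.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Hypothetical output vector at [MASK] and hypothetical label embeddings.
h_mask = [2.0, 0.5]
label_vectors = {
    "Divorce": [1.0, 0.0],
    "Sue": [0.5, 0.5],
    "Attack": [0.0, 1.0],
}

probs = label_probabilities(h_mask, label_vectors)
ranked = sorted(probs, key=probs.get, reverse=True)  # descending probability
print(ranked)
```

With these toy vectors the ranking comes out as Divorce, Sue, Attack; the numbers were chosen to illustrate the ordering and are not model outputs.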
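When the event extraction model predicts event trigger word spans and/or event argument spans, span prediction is modeled as a sequence generation task. For the event text set S, input event text A, and associated event ontology O, the conditional distribution fused with the event ontology is learned by training Z = E_s(A, O) and H = G_s(Z)
Figure PCTCN2022120840-appb-000002
where Z is the latent source-domain representation learned in the encoder E, H is the latent source-domain representation learned in the decoder G,
Figure PCTCN2022120840-appb-000003
Figure PCTCN2022120840-appb-000004
denote the model parameter sets of the encoder and decoder in the source domain, and
Figure PCTCN2022120840-appb-000005
denotes the overall probability of generating the output sequence H given the input event text A and the associated event ontology O, where
Figure PCTCN2022120840-appb-000006
In the embodiment, the parameters of the event extraction model are fine-tuned using the first input sequence and the second input sequence constructed from the event trigger word extraction template and the event argument extraction template, and the fine-tuned model performs the prediction tasks. During training, the loss function is:
Figure PCTCN2022120840-appb-000007
The extraction template for event trigger word extraction is input, as a text sequence, into the standard pre-trained encoder-decoder Transformer framework to predict the event type and the event trigger word span. Taking the model structure for event trigger word extraction and event type classification shown in Fig. 3 as an example, the encoder input sequence is "[unused1][unused2]…[unused9]The divorce settlement called for Giuliani to pay Hanover more than $6.8million.", and the decoder output supervision sequence is "The trigger type is [MASK], trigger token is".
Event trigger type prediction outputs the probability of each event type at the "[MASK]" position, sorted in descending order of probability:
p(Divorce | s_1, s_2, [MASK]) > p(Sue | s_1, s_2, [MASK]) > ... > p(Attack | s_1, s_2, [MASK])
Based on these probabilities, the event type is determined to be "Life:Divorce", and the model generates the event trigger word span "divorce".
The final output sequence text is "The trigger type is Life:Divorce, trigger token is divorce.".
The extraction template for event argument extraction is likewise input, as a text sequence, into the standard pre-trained encoder-decoder Transformer framework to predict the argument role type and the event argument span. Taking the model structure for event argument extraction and event role classification shown in Fig. 4 as an example, the encoder input sequence is "divorce time place person fine partners injure victim agent place bodypart transfer money giver recipient money beneficiary. The divorce settlement called for Giuliani to pay Hanover more than $6.8million." and the decoder output supervision sequence is "The argument type is [MASK], argument token is".
Role type prediction for event arguments outputs the probability of each role type at the "[MASK]" position, sorted in descending order of probability:
p(Person | s_1, s_2, [MASK]) > p(Time | s_1, s_2, [MASK])
Based on these probabilities, the role type is determined to be "Person", and the model generates the corresponding event argument span "Giuliani Hanover".
The final output sequence text is "The argument type is Person, argument token is Giuliani Hanover.".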
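Because the model emits its result as text, a downstream step has to read the type and the tokens back out of the generated sentence. The sketch below is one way to do that with a regular expression; the pattern, the function name `parse_output`, and the choice to split the generated tokens on whitespace are assumptions and are not described in the source.

```python
import re

# Matches "The <kind> type is <type>, <kind> token is <tokens>." where
# <kind> is "trigger" or "argument"; spacing around the comma is optional.
OUTPUT_RE = re.compile(
    r"^The (?P<kind>trigger|argument) type is\s*(?P<type>[^,]+?)\s*,\s*"
    r"(?P=kind) token is\s*(?P<tokens>.*?)\.?$"
)


def parse_output(text):
    """Return the kind, predicted type, and generated tokens, or None if no match."""
    match = OUTPUT_RE.match(text.strip())
    if match is None:
        return None
    return {
        "kind": match.group("kind"),
        "type": match.group("type"),
        "tokens": match.group("tokens").split(),
    }


trigger_result = parse_output("The trigger type is Life:Divorce, trigger token is divorce.")
argument_result = parse_output("The argument type is Person,argument token is Giuliani Hanover.")
print(trigger_result)
print(argument_result)
```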
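Step 6, normalize the predicted event types and role types.
In the embodiment, after types below the probability threshold are filtered out, event types are mapped to event type serial numbers, and role types are mapped to role type serial numbers.
Step 7, normalize the predicted event trigger word spans and event argument spans.
In the embodiment, after spans below the probability threshold are filtered out, event trigger word spans are mapped to event trigger word span labels, and event argument spans are mapped to event argument span labels.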
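Steps 6 and 7 amount to a threshold filter followed by two lookups. In the sketch below, the serial numbers 14 and 20 are the ones used in the embodiment's example, while the threshold of 0.5, the probabilities, the tokenization (with "$6.8" and "million" as separate tokens), and the span convention (1-based start, exclusive end) are assumptions chosen because they reproduce the span labels given in the example.

```python
EVENT_TYPE_IDS = {"Life:Divorce": 14, "Transaction:Transfer-Money": 20}


def normalize_types(predictions, type_ids, threshold):
    """Keep (label, probability) pairs at or above the threshold and
    map each surviving label to its serial number."""
    return [
        (label, type_ids[label])
        for label, p in predictions
        if p >= threshold and label in type_ids
    ]


def span_label(tokens, span_tokens):
    """Return (start, end) with a 1-based start and exclusive end, or None."""
    n = len(span_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == span_tokens:
            return (i + 1, i + 1 + n)
    return None


# Assumed tokenization of the example sentence.
tokens = ["The", "divorce", "settlement", "called", "for", "Giuliani", "to",
          "pay", "Hanover", "more", "than", "$6.8", "million", "."]

# Hypothetical probabilities; only the first survives the 0.5 threshold.
kept = normalize_types([("Life:Divorce", 0.9), ("Justice:Sue", 0.05)], EVENT_TYPE_IDS, 0.5)
print(kept)
print(span_label(tokens, ["divorce"]), span_label(tokens, ["$6.8", "million"]))
```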
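Step 8, integrate the type serial numbers and span labels of the event trigger words and event arguments, and pass them to a structured database for storage.
For example, two events are extracted in the embodiment. One event has the event type "Life:Divorce" with type serial number 14; the event trigger word is "divorce" with span label (2, 3); it has one event argument: the argument span "Giuliani" with span label (6, 7), role type "Person", and role type serial number 35. The other event has the event type "Transaction:Transfer-Money" with type serial number 20; the trigger word is "pay" with span label (8, 9); it has three event arguments: the argument span "Giuliani" with span label (6, 7), role type "Giver", and role type serial number 45; the argument span "$6.8million" with span label (12, 14), role type "Money", and role type serial number 46; and the argument span "Hanover" with span label (9, 10), role type "Recipient", and role type serial number 47. Finally, the structured event extraction result is passed to the structured database for storage.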
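The storage step can be sketched with an in-memory SQLite database. The two-table schema, the column names, and the function `store_events` are hypothetical, since the source does not describe the database layout; the serial numbers and span labels are the ones from the example.

```python
import sqlite3

# Hypothetical schema: one row per event, one row per event argument.
SCHEMA = """
CREATE TABLE event (
    event_id INTEGER PRIMARY KEY,
    type_id INTEGER NOT NULL,
    trigger_word TEXT NOT NULL,
    span_start INTEGER NOT NULL,
    span_end INTEGER NOT NULL
);
CREATE TABLE argument (
    event_id INTEGER NOT NULL REFERENCES event(event_id),
    role_id INTEGER NOT NULL,
    span_text TEXT NOT NULL,
    span_start INTEGER NOT NULL,
    span_end INTEGER NOT NULL
);
"""


def store_events(conn, events):
    """Insert each (type_id, trigger, span, arguments) record and its arguments."""
    for type_id, trigger, (start, end), arguments in events:
        cur = conn.execute(
            "INSERT INTO event (type_id, trigger_word, span_start, span_end) "
            "VALUES (?, ?, ?, ?)",
            (type_id, trigger, start, end),
        )
        conn.executemany(
            "INSERT INTO argument VALUES (?, ?, ?, ?, ?)",
            [(cur.lastrowid, role_id, text, s, e) for role_id, text, (s, e) in arguments],
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
store_events(conn, [
    (14, "divorce", (2, 3), [(35, "Giuliani", (6, 7))]),
    (20, "pay", (8, 9), [
        (45, "Giuliani", (6, 7)),
        (46, "$6.8million", (12, 14)),
        (47, "Hanover", (9, 10)),
    ]),
])
print(conn.execute("SELECT COUNT(*) FROM argument").fetchone()[0])
```

Keeping arguments in their own table lets the same span, here "Giuliani" at (6, 7), appear under both events with different role serial numbers.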
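The generative event extraction method based on ontology guidance provided by the above embodiment can be used for the structured organization of Internet news: it automatically extracts news event names and their associated news event arguments, which can serve downstream scenarios such as targeted news recommendation, organization of related documents, and hotspot-based search queries, improving the speed and accuracy of news event extraction.
The method extracts structured knowledge from complex input event texts using natural language processing techniques. It uses an end-to-end language generation model as the event extraction model and guides event extraction with event ontology knowledge, which improves model performance in fully supervised and few-shot scenarios and provides a better solution for efficiently extracting information from public corpora.
To address the complex model architectures and weak generalization to new event types of existing methods, the present invention introduces an event ontology library that injects event knowledge through prompt construction, giving faster model convergence and significantly improved performance in few-shot settings. By constructing suitable templates for event ontology knowledge and fusing that knowledge efficiently, the invention can efficiently extract structured event knowledge in fully supervised and few-shot scenarios and has practical industrial value. The method requires no additional parameters beyond those already in the pre-trained language model, so it is simple and flexible to implement.
The specific embodiments described above explain the technical solution and beneficial effects of the present invention in detail. It should be understood that the above are only the most preferred embodiments of the present invention and do not limit it; any modification, supplement, or equivalent replacement made within the scope of the principles of the present invention shall fall within its scope of protection.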

Claims (9)

  1. 一种基于本体指导的生成式事件抽取方法,其特征在于,包括以下步骤:
    步骤1,根据领域知识库和事件标注框架构建事件本体知识库;
    步骤2,设计生成式事件抽取的事件触发词提取模板和事件论元提取模板;事件触发词提取模板将输入事件文本映射为事件提取模型的第一输入序列;事件论元提取模板将融入事件本体的输入事件文本映射为事件提取模型的第二输入序列;
    步骤3,设计类标映射函数,该映类标射函数处理多单词标签到事件类型和/或角色类型的映射;
    步骤4,对于输入事件文本,从事件本体知识库中提取与输入事件对应的事件本体,并根据输入事件文本和事件本体,按照事件触发词提取模板和事件论元提取模板构建第一输入序列和第二输入序列;
    步骤5,第一输入序列和第二输入序列输入事件提取模型,事件提取模型根据类标映射函数和自身处理机制预测事件类型和角色类型,同时输出事件触发词跨度和事件论元跨度。
  2. 根据权利要求1所述的基于本体指导的生成式事件抽取方法,其特征在于,步骤1中,构建事件本体知识库的过程为:
    步骤1.1,利用ACE预定义的事件框架作为目标事件本体;
    步骤1.2,抽取FrameNet中与目标事件本体相关的事件框架作为扩充的事件本体;
    步骤1.3,整合目标事件本体和扩充的事件本体,进行去重和人工检查,得到事件本体知识库。
  3. 根据权利要求1所述的基于本体指导的生成式事件抽取方法,其特征在于,步骤2中,设计的事件触发词提取模板为:
    [第一标记符]<伪模板><输入事件文本>[第二标记符]事件触发词为[MASK],触发词令牌为,对应英文为[CLS]<pseudo template><input sentence>[SOS]The trigger word is[MASK],trigger token is;
    其中,伪模板采用预训练词嵌入中未使用的虚拟伪标签。
  4. 根据权利要求1所述的基于本体指导的生成式事件抽取方法,其特征在于,步骤2中,设计的事件论元提取模板为:
    [第一标记符]<事件本体><输入事件文本>[第二标记符]论元类型为[MASK],论元令牌为,对应英文为[CLS]<Event ontology><input sentence>[SOS]The argument type is[MASK],argument token is。
  5. 根据权利要求1所述的基于本体指导的生成式事件抽取方法,其特征在于,步骤3中,设计的类标映射函数为:
    Y(r i)={w 1,w 2,...,w n}
    在进行事件类型预测时,Y(r i)表示第i个事件类型与多单词标签的映射函数,w n表示事件类型的第n个词汇标签的词嵌入向量;
    在进行角色类型预测时,Y(r i)表示第i个角色类型与多单词标签的映射函数,w n表示角色类型的第n个词汇标签的词嵌入向量。
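  6. The generative event extraction method based on ontology guidance according to claim 1, characterized in that, in Step 5, the event extraction model adopts an encoder-decoder Transformer framework;
    When the event extraction model predicts the event type and/or the role type, the predicted probability of the event type and/or role type is obtained by the following formula:
    Figure PCTCN2022120840-appb-100001
    where $p(r_i)$ denotes the predicted probability of the i-th event type or role label $r_i$, $h_{[MASK]}$ denotes the output vector of the event extraction model at the [MASK] position, $w$ denotes the word embedding vector of the vocabulary label of the target event type/role type, $w'$ denotes the word embedding vectors of the vocabulary labels of all event types/role types, and $R$ denotes the set of all event types/role labels;
    When the event extraction model predicts the event trigger span and/or the event argument span, the span prediction is modeled as a sequence generation task: for an event text set $S$, an input event text $A$, and the associated event ontology $O$, the conditional distribution that incorporates the event ontology is learned by training $Z = E_s(A, O)$ and $H = G_s(Z)$
    Figure PCTCN2022120840-appb-100002
    where $Z$ is the latent source-domain representation learned in the encoder $E$, $H$ denotes the latent source-domain representation learned in the decoder $G$,
    Figure PCTCN2022120840-appb-100003
    Figure PCTCN2022120840-appb-100004
    denote the model parameter sets of the encoder and the decoder in the source domain,
    Figure PCTCN2022120840-appb-100005
    denotes the overall probability of generating the output sequence $H$ given the input event text $A$ and the associated event ontology $O$, where
    Figure PCTCN2022120840-appb-100006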
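The formula images of claim 6 are not reproduced in this text. The sketch below therefore shows only one plausible reading of the surrounding definitions, not the patent's exact formulas: a softmax over dot products between the [MASK] output vector and one vector per type label, and an autoregressive log-probability for a generated sequence. Representing each label by a single vector, and both function names, are assumptions.

```python
import math
from typing import Dict, List


def type_probabilities(h_mask: List[float],
                       label_vectors: Dict[str, List[float]]) -> Dict[str, float]:
    """Softmax over dot products w . h_[MASK] for every label in the set R."""
    scores = {label: sum(a * b for a, b in zip(vec, h_mask))
              for label, vec in label_vectors.items()}
    max_score = max(scores.values())  # subtract the maximum for numerical stability
    exp_scores = {label: math.exp(s - max_score) for label, s in scores.items()}
    total = sum(exp_scores.values())
    return {label: e / total for label, e in exp_scores.items()}


def sequence_log_prob(token_probs: List[float]) -> float:
    """Log-probability of a generated sequence as the sum of per-token log-probabilities."""
    return sum(math.log(p) for p in token_probs)


probs = type_probabilities([1.0, 0.0], {"Attack": [2.0, 0.0], "Meet": [0.0, 2.0]})
```

Here `probs["Attack"]` is larger than `probs["Meet"]` because the "Attack" label vector is aligned with the toy [MASK] output vector.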
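  7. The generative event extraction method based on ontology guidance according to claim 1, characterized in that the parameters of the event extraction model are fine-tuned using the first input sequence and the second input sequence constructed from the event trigger extraction template and the event argument extraction template, and the fine-tuned event extraction model is used for the prediction tasks.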
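  8. The generative event extraction method based on ontology guidance according to claim 1, characterized in that the generative event extraction method further comprises:
    Step 6: normalize the event types and role types predicted in Step 5; after filtering out types whose probability is below the probability threshold, map event types to event type indices and role types to role type indices.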
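  9. The generative event extraction method based on ontology guidance according to claim 1 or 8, characterized in that the generative event extraction method further comprises:
    Step 7: normalize the event trigger spans and event argument spans predicted in Step 5; after filtering out spans whose probability is below the probability threshold, map event trigger spans to event trigger span labels and event argument spans to event argument span labels;
    Step 8: integrate the type indices and span labels of the event triggers and event arguments, and pass them to a structured database for storage.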
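The post-processing of Steps 6 to 8 can be sketched as follows. The threshold value, the record layout, and all function names are hypothetical, and the final write to a structured database is left out; the sketch only shows threshold filtering, index and label mapping, and integration into one record.

```python
from typing import Dict, List, Tuple


def filter_types(type_probs: Dict[str, float],
                 type_to_id: Dict[str, int],
                 threshold: float) -> List[int]:
    """Step 6: drop types below the threshold, then map the rest to their indices."""
    return [type_to_id[t] for t, p in type_probs.items() if p >= threshold]


def filter_spans(spans: List[Tuple[str, float]],
                 label: str,
                 threshold: float) -> List[Dict[str, str]]:
    """Step 7: drop spans below the threshold, then attach a span label to the rest."""
    return [{"span": text, "label": label} for text, p in spans if p >= threshold]


def build_record(event_type_ids: List[int],
                 role_type_ids: List[int],
                 triggers: List[Dict[str, str]],
                 arguments: List[Dict[str, str]]) -> Dict[str, list]:
    """Step 8: integrate indices and labelled spans into one record ready for storage."""
    return {
        "event_type_ids": event_type_ids,
        "role_type_ids": role_type_ids,
        "triggers": triggers,
        "arguments": arguments,
    }


record = build_record(
    filter_types({"Attack": 0.9, "Meet": 0.1}, {"Attack": 0, "Meet": 1}, threshold=0.5),
    filter_types({"Attacker": 0.7, "Place": 0.3}, {"Attacker": 0, "Place": 1}, threshold=0.5),
    filter_spans([("attacked", 0.8), ("town", 0.2)], "TRIGGER", threshold=0.5),
    filter_spans([("Troops", 0.9)], "ARGUMENT", threshold=0.5),
)
```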
PCT/CN2022/120840 2021-09-28 2022-09-23 Generative event extraction method based on ontology guidance WO2023051399A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/280,655 US20240143633A1 (en) 2021-09-28 2022-09-23 Generative event extraction method based on ontology guidance

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111142014.3A CN113987104B (zh) 2021-09-28 2021-09-28 Generative event extraction method based on ontology guidance
CN202111142014.3 2021-09-28

Publications (1)

Publication Number Publication Date
WO2023051399A1 (zh) 2023-04-06

Family

ID=79737025

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/120840 WO2023051399A1 (zh) 2021-09-28 2022-09-23 Generative event extraction method based on ontology guidance

Country Status (3)

Country Link
US (1) US20240143633A1 (zh)
CN (1) CN113987104B (zh)
WO (1) WO2023051399A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
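CN116501898A (zh) * 2023-06-29 2023-07-28 之江实验室 Financial text event extraction method and apparatus suitable for few-shot and biased data
CN116861901A (zh) * 2023-07-04 2023-10-10 广东外语外贸大学 Chinese event detection method, system, and electronic device based on multi-task learning
CN117390175A (zh) * 2023-12-13 2024-01-12 临沂大学 BERT-based smart home usage event extraction method
CN117473093A (zh) * 2023-12-25 2024-01-30 中科雨辰科技有限公司 Data processing system for acquiring target events based on an LLM
CN117493486A (zh) * 2023-11-10 2024-02-02 华泰证券股份有限公司 Sustainable financial event extraction system and method based on data replay
CN117910473A (zh) * 2024-03-19 2024-04-19 北京邮电大学 Event argument extraction method incorporating entity type information, and related device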

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240143633A1 (en) * 2021-09-28 2024-05-02 Zhejiang University Generative event extraction method based on ontology guidance
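CN114239536B (zh) * 2022-02-22 2022-06-21 北京澜舟科技有限公司 Event extraction method, system, and computer-readable storage medium
CN114936563B (zh) * 2022-04-27 2023-07-25 苏州大学 Event extraction method, apparatus, and storage medium
CN115238045B (zh) * 2022-09-21 2023-01-24 北京澜舟科技有限公司 Generative event argument extraction method, system, and storage medium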

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
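CN106599032A (zh) * 2016-10-27 2017-04-26 浙江大学 Text event extraction method combining sparse coding and a structured perceptron
WO2020001373A1 (zh) * 2018-06-26 2020-01-02 杭州海康威视数字技术股份有限公司 Ontology construction method and apparatus
CN113312916A (zh) * 2021-05-28 2021-08-27 北京航空航天大学 Financial text event extraction method and apparatus based on trigger-word morphology learning
CN113420552A (zh) * 2021-07-13 2021-09-21 华中师范大学 Biomedical multi-event extraction method based on reinforcement learning
CN113987104A (zh) * 2021-09-28 2022-01-28 浙江大学 Generative event extraction method based on ontology guidance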

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8527262B2 (en) * 2007-06-22 2013-09-03 International Business Machines Corporation Systems and methods for automatic semantic role labeling of high morphological text for natural language processing applications
US20170277996A1 (en) * 2016-03-25 2017-09-28 TripleDip, LLC Computer implemented event prediction in narrative data sequences using semiotic analysis
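CN112655047B (zh) * 2018-09-05 2024-05-28 皇家飞利浦有限公司 Method for classifying medical records
CN112116075B (zh) * 2020-09-18 2023-11-24 厦门安胜网络科技有限公司 Event extraction model generation method and apparatus, and text event extraction method and apparatus
CN112528676B (zh) * 2020-12-18 2022-07-08 南开大学 Document-level event argument extraction method
CN112861527A (zh) * 2021-03-17 2021-05-28 合肥讯飞数码科技有限公司 Event extraction method, apparatus, device, and storage medium
CN113312464B (zh) * 2021-05-28 2022-05-31 北京航空航天大学 Event extraction method based on dialogue state tracking
CN113255321B (zh) * 2021-06-10 2021-10-29 之江实验室 Document-level event extraction method for the financial domain based on dependency relations among article entity words
CN113656805B (zh) * 2021-07-22 2023-06-20 扬州大学 Method and system for automatically constructing an event graph for multi-source vulnerability information
CN113779358B (zh) * 2021-09-14 2024-05-24 支付宝(杭州)信息技术有限公司 Event detection method and system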

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
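CN116501898A (zh) * 2023-06-29 2023-07-28 之江实验室 Financial text event extraction method and apparatus suitable for few-shot and biased data
CN116501898B (zh) * 2023-06-29 2023-09-01 之江实验室 Financial text event extraction method and apparatus suitable for few-shot and biased data
CN116861901A (zh) * 2023-07-04 2023-10-10 广东外语外贸大学 Chinese event detection method, system, and electronic device based on multi-task learning
CN116861901B (zh) * 2023-07-04 2024-04-09 广东外语外贸大学 Chinese event detection method, system, and electronic device based on multi-task learning
CN117493486A (zh) * 2023-11-10 2024-02-02 华泰证券股份有限公司 Sustainable financial event extraction system and method based on data replay
CN117390175A (zh) * 2023-12-13 2024-01-12 临沂大学 BERT-based smart home usage event extraction method
CN117390175B (zh) * 2023-12-13 2024-03-12 临沂大学 BERT-based smart home usage event extraction method
CN117473093A (zh) * 2023-12-25 2024-01-30 中科雨辰科技有限公司 Data processing system for acquiring target events based on an LLM
CN117473093B (zh) * 2023-12-25 2024-04-12 中科雨辰科技有限公司 Data processing system for acquiring target events based on an LLM
CN117910473A (zh) * 2024-03-19 2024-04-19 北京邮电大学 Event argument extraction method incorporating entity type information, and related device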

Also Published As

Publication number Publication date
US20240143633A1 (en) 2024-05-02
CN113987104A (zh) 2022-01-28
CN113987104B (zh) 2024-06-21

Similar Documents

Publication Publication Date Title
WO2023051399A1 (zh) Generative event extraction method based on ontology guidance
Mao et al. MetaPro: A computational metaphor processing model for text pre-processing
McDonald et al. Discriminative learning and spanning tree algorithms for dependency parsing
De Silva Survey on publicly available sinhala natural language processing tools and research
Wu et al. Community answer generation based on knowledge graph
Li et al. A survey of discourse parsing
Lata et al. Mention detection in coreference resolution: survey
Mezghanni et al. Deriving ontological semantic relations between Arabic compound nouns concepts
Kanev et al. Metagraph knowledge base and natural language processing pipeline for event extraction and time concept analysis
CN112800244A (zh) Method for constructing a knowledge graph of traditional Chinese medicine and ethnic medicine
Zouaq et al. Semantic analysis using dependency-based grammars and upper-level ontologies.
Lee Natural Language Processing: A Textbook with Python Implementation
Seifossadat et al. Stochastic Data-to-Text Generation Using Syntactic Dependency Information
He et al. [Retracted] Application of Grammar Error Detection Method for English Composition Based on Machine Learning
Costa Processing Temporal Information in Unstructured Documents
Bindu et al. Design and development of a named entity based question answering system for Malayalam language
Özateş et al. A Hybrid Deep Dependency Parsing Approach Enhanced With Rules and Morphology: A Case Study for Turkish
Meng et al. Intelligent algorithms of English semantic analysis based on deep learning technology
Popa et al. Towards syntax-aware token embeddings
Le-Hong et al. Vietnamese semantic role labelling
Worke INFORMATION EXTRACTION MODEL FROM GE’EZ TEXTS
Grella Notes about a more aware dependency parser
Tomeh Discriminative Alignment Models For Statistical Machine Translation
Vaghasia An Improvised Approach of Deep Learning Neural Networks in NLP Applications
Suster Empirical studies on word representations

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22874785

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18280655

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE