WO2018218707A1 - Neural network and attention mechanism-based information relation extraction method - Google Patents

Neural network and attention mechanism-based information relation extraction method

Info

Publication number
WO2018218707A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
information
intelligence
training
relationship
Application number
PCT/CN2017/089137
Other languages
French (fr)
Chinese (zh)
Inventor
刘兵
周勇
张润岩
王重秋
Original Assignee
中国矿业大学
Application filed by 中国矿业大学
Publication of WO2018218707A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/284 - Lexical analysis, e.g. tokenisation or collocates
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 - Named entity recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks

Definitions

  • The invention relates to the fields of recurrent neural networks combined with an attention mechanism, natural language processing, and intelligence analysis, and in particular to a method for extracting intelligence relationships using a bidirectional recurrent neural network combined with an attention mechanism.
  • At present, the relationship classification of intelligence is mostly based on a standard knowledge framework or model paradigm: domain experts extract the key features of intelligence, organize the expression forms of the various relationship categories, and build a knowledge base to complete the relationship classification.
  • The intelligence analysis system of patent CN201410487829.9, based on a standard knowledge framework, uses a computer to accumulate knowledge, integrate scattered information, and synthesize historical information to identify intelligence relationships, finally providing a mind map for command decision-making to assist decisions.
  • The intelligence association processing method of patent CN201610015796, based on a domain knowledge model, extracts feature vocabulary by means of named-entity recognition and a domain dictionary, trains the topic relevance of the feature words with a topic-map model to establish a topic-word template for an event, and uses this template to judge the relevance of intelligence.
  • Patents CN201610532802.6, CN201610393749.6 and CN201610685532.2 respectively use a multi-layer convolutional neural network, a convolutional neural network combined with distant supervision, and a convolutional neural network combined with attention to perform relation extraction.
  • Against this research background, existing relation extraction methods for intelligence mainly suffer from the following problems:
  • First, intelligence analysis based on a knowledge framework or model requires a large number of historical cases with wide coverage and domain experts with professional knowledge to build the knowledge base; the workload is large, and the completed framework may generalize poorly.
  • Second, neural network-based methods mostly remain at the stage of theoretical research and need some adjustment in practical applications; moreover, the convolutional neural networks now widely used are poor at capturing whole-sentence context, and without special processing their accuracy is inferior to that of a bidirectional recurrent neural network (bi-directional RNN).
  • The present invention provides an intelligence relationship extraction method that is intelligent, highly accurate, and presents its results effectively.
  • An intelligence relationship extraction method based on a neural network and an attention mechanism comprises the following steps:
  • Step 1) Construct a user dictionary; the neural network system has an initial user dictionary.
  • Step 2) Train word vectors: extract text data from databases related to the field and, using the user dictionary obtained in step 1), train a word-vector library that maps the words in the text data into numerical vector data;
  • Step 3) Construct a training set: extract intelligence pairs from the historical intelligence database and, using the word-vector library obtained in step 2), convert each pair into an intelligence-relationship triple of training data <intelligence 1, intelligence 2, relationship>;
  • Step 4) Corpus preprocessing: first use the user dictionary obtained in step 1) to preprocess the training data obtained in step 3), namely word segmentation and named-entity recognition, implemented with existing automated tools;
  • The final result of preprocessing is that each piece of intelligence is converted into an intelligence word matrix whose rows are the word-vector dimensions and whose columns correspond to the sentence length, with the named-entity positions marked; the intelligence items come in pairs;
  • Step 5) Train the neural network model: feed the matrices obtained in step 4) into the neural network for training to obtain the relation-extraction neural network model; the neural network training method comprises the following steps:
  • Step 5-1) Input the intelligence word matrix into bidirectional long short-term memory (Bi-LSTM) units to extract context-integrated information, feeding the forward sentence and the reversed sentence into two long short-term memory (LSTM) units respectively; when computing the current time step, the effect of the previous time step is considered iteratively. The combined expressions for the hidden-layer computation and feature extraction of an LSTM unit are as follows:
    i_t = σ(W_xi·x_t + W_hi·h_{t-1} + W_ci·c_{t-1} + b_i)
    f_t = σ(W_xf·x_t + W_hf·h_{t-1} + W_cf·c_{t-1} + b_f)
    g_t = tanh(W_xc·x_t + W_hc·h_{t-1} + W_cc·c_{t-1} + b_c)
    c_t = i_t·g_t + f_t·c_{t-1}
    o_t = σ(W_xo·x_t + W_ho·h_{t-1} + W_co·c_t + b_o)
    h_t = o_t·tanh(c_t)
  • x_t is the intelligence word matrix obtained in step 4) at time t, and is also the input matrix of the neural network;
  • i_t is the output of the input gate at time t;
  • f_t is the output of the forget gate at time t;
  • g_t is the output of the input integration at time t;
  • c_t and c_{t-1} are the memory-stream states at times t and t-1;
  • o_t is the output of the output gate at time t;
  • h_t and h_{t-1} are the hidden-layer information at times t and t-1, i.e. the features extracted by the neural network;
  • σ() is the sigmoid activation function, and tanh() is the hyperbolic tangent activation function;
  • W_xi, W_hi, W_ci, etc. are weight parameters to be trained; the first subscript indicates the input quantity being multiplied and the second the computation the weight belongs to;
  • b_i, b_f, etc. are bias parameters to be trained; the subscript indicates the computation they belong to;
  • The parameters W_xi, W_hi, W_ci, b_i and b_f are first randomly initialized, then corrected automatically during training, and take their final values as training completes;
  • Step 5-2) Weight and splice the outputs of the two LSTM units for the forward sentence and the reversed sentence as the final output of the neural network:
    o_final = W_fw·h_fw + W_bw·h_bw
  • h_fw is the output of the LSTM network processing the forward sentence, and W_fw is its corresponding trainable weight;
  • h_bw is the output of the LSTM network processing the reversed sentence, and W_bw is its corresponding trainable weight;
  • o_final is the final output of the neural network;
  • The weights W_fw and W_bw are likewise randomly initialized first, corrected automatically during training, and take their final values as training completes;
  • Step 5-3) Compute the attention distribution over the whole intelligence sentence from the neural network outputs at the named-entity positions, and combine the whole-sentence network output according to that distribution, with the following formulas:
    α = softmax(tanh(E)·W_a·O_final)
    r = α·O_final
  • α is the attention distribution matrix;
  • r is the output of the intelligence sentence after targeted integration;
  • E is the output of the recurrent network at the named-entity positions; using a fixed-window scheme, the top K most important named entities are concatenated into an entity matrix;
  • O_final is the output of the recurrent network, of the form [o_1, o_2, o_3 … o_n], where o_1, o_2, o_3 … o_n are the outputs of the corresponding network nodes and n is the number of words in the intelligence item;
  • W_a is the weight matrix to be trained, softmax() is the softmax classifier function, and tanh() is the hyperbolic tangent activation function;
  • The weight W_a is likewise randomly initialized first, corrected automatically during training, and takes its final value as training completes;
  • Step 5-4) Splice the feature vectors r of the two intelligence items, feed the result into a fully connected layer, and finally classify the relationship with a softmax classifier; the weights are trained by gradient descent on the resulting predictions;
  • Step 6) Intelligence acquisition: input text intelligence in pairs (one batch may contain several pairs), where each piece of text intelligence is a passage with a clear central topic; for new intelligence, the user dictionary obtained in step 1) may optionally be expanded;
  • Step 7) Text preprocessing: using the segmentation tool trained in step 4), the word-vector library obtained in step 2), and the named-entity recognition tool used in step 4), convert the original whole-sentence text from step 6) into intelligence value matrices, where each row is the vector representation of a word, one matrix represents one intelligence item, and the named-entity positions are marked;
  • Step 8) Relation extraction: feed the paired intelligence matrices from step 7) into the relation-extraction neural network model trained in step 5) for automatic relation extraction, finally obtaining the relation category of each pair;
  • Step 9) Incremental updating: judge whether the relation category obtained in step 8) for each pair is correct; if correct, visualize the intelligence acquired in step 6) together with the corresponding relation category; if wrong, the correctly judged intelligence-relationship triples may be added to the training set of step 3), repeating steps 4) and 5) to retrain the corrected neural network model.
  • Optionally, step 1) constructs a professional-domain user dictionary, which holds proper nouns of a specific field that are difficult to recognize outside that field; other common vocabulary can be recognized automatically;
  • The proprietary vocabulary can be selected from the historical intelligence database: if a term extracted from the historical intelligence database is proprietary, the user only needs to add the known term to the user dictionary of the neural network system.
  • The training set is constructed by extracting a sufficient quantity of intelligence, more than 5,000 pairs, from the historical intelligence database and building the intelligence-relationship triple training data; specifically, the relationship categories are determined first, including cause and consequence, topic and detailed description, location links, and time links, and according to these relationships the intelligence pairs are formed into triples of the form <intelligence 1, intelligence 2, relationship>.
  • Text data is extracted from domain-related databases and combined with text corpora from online encyclopedias and news broadcasts; the word-vector library is trained with Google's word2vec toolkit, mapping the text vocabulary into numerical vector data that contains the original semantic information, thereby completing the conversion from natural language to a numerical representation.
  • Chinese is semantically organized in units of words, so whole-sentence input must first be segmented; the professional-domain user dictionary is added during segmentation.
  • The intelligence in the acquisition step should be a short passage of at most 100 words with a clear central topic; relation extraction targets a binary relation, i.e. the object of processing is a pair of intelligence items, so the input to the LSTM units should be text intelligence in pairs.
  • Word segmentation and named-entity recognition are implemented with existing automated tools such as nlpir and stanford-ner.
  • The professional-domain user dictionary is used when the automated tools perform segmentation and named-entity recognition.
  • Compared with the prior art, the invention has the following beneficial effects:
  • The invention uses a bidirectional recurrent neural network with named-entity-based attention over each word of the intelligence to extract feature information from the word-vector representation of the intelligence, and further classifies the extracted features with a softmax classifier, thereby completing the intelligence relation extraction task.
  • A bidirectional recurrent neural network has strong feature-extraction ability on text data, which overcomes the heavy manual feature-engineering workload of traditional knowledge-base methods and the weak generalization caused by their subjectivity; a bidirectional LSTM network can effectively take the complete context into account, and the attention weights of the named entities automatically assign an importance to each word of the intelligence according to these narrative center words, giving the relation extraction method of the invention higher accuracy than other neural network methods.
  • FIG. 1 is a flow chart of the intelligence relationship extraction method based on a neural network and an attention mechanism according to the present invention.
  • FIG. 2 is a schematic diagram of the bidirectional recurrent neural network used in the intelligence relationship extraction method based on a neural network and an attention mechanism according to the present invention.
  • FIG. 3 is a schematic diagram of the attention mechanism used in the intelligence relationship extraction method based on a neural network and an attention mechanism according to the present invention.
  • In implementation, the intelligence relationship extraction method based on a neural network and an attention mechanism is divided into two phases: a training phase and an application phase.
  • In the training phase, the system first constructs a user dictionary (optional) and trains word vectors, then builds a training set from the historical intelligence database, preprocesses the corpus, and finally trains the relation-extraction neural network model.
  • The neural network system has an initial user dictionary, and vocabulary is extracted from the historical intelligence database; if an extracted term is proprietary, the user simply adds the known proprietary term to the user dictionary of the neural network system, thereby building a proprietary-vocabulary user dictionary. A professional-domain user dictionary holds proper nouns of a specific field that are difficult to recognize outside that field; other common vocabulary is recognized automatically.
  • To train word vectors, text data is extracted from domain-related databases and combined with text corpora such as online encyclopedias and news broadcasts; using the user dictionary obtained in step (I) a), a word-vector library is trained with Google's word2vec toolkit, mapping the text vocabulary into numerical vector data that contains the original semantic information and thereby completing the conversion from natural language to a numerical representation.
  • Step (I) c) specifically requires first determining the relationship categories, such as cause and consequence, topic and detailed description, location links, and time links; according to these relationships, intelligence pairs are formed into triples of the form <intelligence 1, intelligence 2, relationship>.
  • Corpus preprocessing first uses the user dictionary obtained in step (I) a) to preprocess the triple training data obtained in step (I) c), namely word segmentation and named-entity recognition, implemented with existing automated tools such as nlpir and stanford-ner; the professional-domain user dictionary is used in this process, and an accuracy of 95% or more can ultimately be reached.
  • The final result of preprocessing is that each piece of intelligence in the triple training data is converted into a matrix whose rows are the word-vector dimensions and whose columns correspond to the sentence length, with the named-entity positions marked; the intelligence items remain in pairs.
  • Each pair of intelligence matrices preprocessed in step (I) d) goes through the following neural network training process: the preprocessed intelligence matrices are fed into the relation-extraction neural network for training. First, the intelligence word matrix is input into the bidirectional long short-term memory network (Bi-LSTM) to extract context-integrated information.
  • The LSTM network equations are as follows:
    i_t = σ(W_xi·x_t + W_hi·h_{t-1} + W_ci·c_{t-1} + b_i)
    f_t = σ(W_xf·x_t + W_hf·h_{t-1} + W_cf·c_{t-1} + b_f)
    g_t = tanh(W_xc·x_t + W_hc·h_{t-1} + W_cc·c_{t-1} + b_c)
    c_t = i_t·g_t + f_t·c_{t-1}
    o_t = σ(W_xo·x_t + W_ho·h_{t-1} + W_co·c_t + b_o)
    h_t = o_t·tanh(c_t)
  • x_t is the matrix obtained in step 4) at time t (i.e. for the t-th word-vector input), and is also the input matrix of the neural network;
  • i_t is the output of the input gate at time t; it determines how much of the current information the memory stream records;
  • f_t is the output of the forget gate at time t; it determines, given the current information, how much of the stored memory data the memory stream forgets;
  • g_t is the output of the input integration at time t; it integrates the information input this time;
  • c_t and c_{t-1} are the memory-stream states at times t and t-1;
  • o_t is the output of the output gate at time t; it determines how much data is emitted from the memory stream;
  • h_t and h_{t-1} are the hidden-layer information at times t and t-1, i.e. the features extracted by the neural network;
  • σ() is the sigmoid activation function, and tanh() is the hyperbolic tangent activation function;
  • W_xi, W_hi, W_ci, etc. are weight parameters to be trained; the first subscript indicates the input quantity being multiplied and the second the computation the weight belongs to;
  • b_i, b_f, etc. are bias parameters to be trained; the subscript indicates the computation they belong to.
  • The parameters W_xi, W_hi, W_ci, b_i and b_f are first randomly initialized, then corrected automatically during training, and take their final values as training completes;
  • The concrete implementation of the bidirectional recurrent neural network is to train two recurrent networks.
  • Their inputs are the forward sentence and the reversed sentence.
  • In Figure 2, w1, w2, w3, … denote a sequence of words (the sentence).
  • The two networks receive the words in forward and reverse order respectively; their outputs are then spliced as the final output of the neural network, i.e. o1, o2, o3, … in the figure, according to:
    o_final = W_fw·h_fw + W_bw·h_bw
  • h_fw is the output of the network processing the forward sentence, and W_fw is its corresponding trainable weight;
  • h_bw is the output of the network processing the reversed sentence, and W_bw is its corresponding trainable weight;
  • o_final is the final output of the neural network.
  • The weights W_fw and W_bw are likewise randomly initialized first, corrected automatically during training, and take their final values as training completes;
  • The attention distribution over the whole intelligence sentence is computed from the network outputs at the named-entity positions, and the whole-sentence output is combined according to that distribution:
    α = softmax(tanh(E)·W_a·O_final)
    r = α·O_final
  • α is the attention distribution matrix;
  • r is the output of the intelligence sentence after targeted integration;
  • E is the output of the recurrent network at the named-entity positions; using a fixed-window scheme, the top K most important named entities are concatenated into an entity matrix;
  • O_final is the output of the recurrent network, of the form [o_1, o_2, o_3 … o_n], where o_1 … o_n are the outputs of the corresponding network nodes and n is the number of words in the intelligence item;
  • W_a is the weight matrix to be trained, softmax() is the softmax classifier function, and tanh() is the hyperbolic tangent activation function;
  • The weight W_a is likewise randomly initialized first, corrected automatically during training, and takes its final value as training completes;
  • The feature vectors r of the two intelligence items are spliced and fed into a fully connected layer, and finally a softmax classifier performs the relation classification; the weights are trained by gradient descent on the resulting predictions;
  • In the application phase, the intelligence relationship extraction method of the present invention comprises four steps: intelligence acquisition, text preprocessing, relation extraction, and incremental updating.
  • Each piece of intelligence should be a short passage of at most 100 words with a clear central topic.
  • Relation extraction targets a binary relation, i.e. the object of processing is a pair of intelligence items, so the system input should be text intelligence in pairs; one batch may contain several pairs.
  • As shown in Figure 1, if the input is new intelligence, the user dictionary of step (I) a) may optionally be expanded to cover the new vocabulary.
  • For text preprocessing, the segmentation tool trained in step (I) d), the word-vector library obtained in step (I) b), and the named-entity recognition tool used in step (I) d) convert the original whole-sentence text of the two paired intelligence items from step (II) a) into numerical matrices, where each row is the vector representation of a word, one matrix represents one intelligence item, and the named-entity positions are marked.
  • For relation extraction, the paired intelligence matrices processed in step (II) b) are fed into the relation-extraction neural network model trained in step (I) e); relations are extracted automatically, finally yielding the relation category of each pair.
  • Step (II) d), incremental updating: as shown in Figure 1, the system supports correcting erroneous judgments. The relation category obtained in step (II) c) for each pair is judged correct or incorrect; if correct, the intelligence acquired in step (II) a) and the corresponding relation category are visualized; if incorrect, the correctly judged intelligence-relationship triples may be added to the training set of step (I) c), and steps (I) d) and (I) e) are repeated to retrain the corrected neural network model.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)
  • Character Discrimination (AREA)

Abstract

The present invention relates to the fields of recurrent neural networks combined with attention mechanisms, natural language processing, and information analysis, and provides a neural network and attention mechanism-based information relation extraction method for solving the problems of heavy workload and weak generalization in existing information analysis systems, which are mostly based on manually constructed knowledge bases. The method comprises a training phase and an application phase. In the training phase, a user dictionary is first constructed and word vectors are trained, a training set is built from a historical information database, the corpus is preprocessed, and the neural network model is then trained. In the application phase, information is acquired and preprocessed, and the information relation extraction task is completed automatically, while user-dictionary expansion and error-correction judgments are supported, the corrected results being added to the training set to incrementally retrain the neural network model. The information relation extraction method can find the relationships between pieces of information, provides a basis for event-context integration and decision-making, and has wide application value.

Description

An Information Relationship Extraction Method Based on a Neural Network and an Attention Mechanism

Technical Field
The invention relates to the fields of recurrent neural networks combined with an attention mechanism, natural language processing, and intelligence analysis, and in particular to a method for extracting intelligence relationships using a bidirectional recurrent neural network combined with an attention mechanism.
Background Art

With the development of technologies in the information age, the amount of information data has grown explosively. Today, techniques for acquiring and storing intelligence are relatively mature, while many technical improvements are still needed in fields such as intelligence analysis and the extraction of key information from massive intelligence data. Intelligence data is strongly thematic, highly time-sensitive, and rich in implicit information. Analyzing the relationships among intelligence under the same theme and integrating it by spatio-temporal, causal, and other relationships makes it possible to describe a thematic event, analyze it from multiple angles, and provide a basis for final decision-making. Finding the relationships between pieces of intelligence and integrating them into an event context is therefore of great practical significance.
At present, the relationship classification of intelligence is mostly based on a standard knowledge framework or model paradigm: domain experts extract the key features of intelligence, organize the expression forms of the various relationship categories, and build a knowledge base to complete the relationship classification. The intelligence analysis system of patent CN201410487829.9, based on a standard knowledge framework, uses a computer to accumulate knowledge, integrate scattered information, and synthesize historical information to identify intelligence relationships, finally providing a mind map for command decision-making to assist decisions. The intelligence association processing method of patent CN201610015796, based on a domain knowledge model, extracts feature vocabulary by means of named-entity recognition and a domain dictionary, trains the topic relevance of the feature words with a topic-map model to establish a topic-word template for an event, and uses this template to judge the relevance of intelligence.

In addition, some studies apply machine-learning neural network methods to relation extraction. Patents CN201610532802.6, CN201610393749.6 and CN201610685532.2 respectively use a multi-layer convolutional neural network, a convolutional neural network combined with distant supervision, and a convolutional neural network combined with attention to perform relation extraction.

Against this research background, existing relation extraction methods for intelligence mainly have the following problems. First, intelligence analysis based on a knowledge framework or model requires a large number of historical cases with wide coverage and domain experts with professional knowledge to build the knowledge base; the workload is large, and the completed framework may generalize poorly. Second, neural network-based methods mostly remain at the stage of theoretical research and need some adjustment in practical applications; moreover, the convolutional neural networks now widely used are poor at capturing whole-sentence context, and without special processing their accuracy is inferior to that of a bidirectional recurrent neural network (bi-directional RNN).
Summary of the Invention

Object of the invention: in order to overcome the deficiencies of the prior art, the present invention provides an intelligence relationship extraction method that is intelligent, highly accurate, and presents its results effectively.

Technical solution: to achieve the above object, the technical solution adopted by the present invention is as follows.
An intelligence relationship extraction method based on a neural network and an attention mechanism comprises the following steps:

Step 1) Construct a user dictionary; the neural network system has an initial user dictionary.

Step 2) Train word vectors: extract text data from databases related to the field and, using the user dictionary obtained in step 1), train a word-vector library that maps the words in the text data into numerical vector data;

Step 3) Construct a training set: extract intelligence pairs from the historical intelligence database and, using the word-vector library obtained in step 2), convert each pair into an intelligence-relationship triple of training data <intelligence 1, intelligence 2, relationship>;

Step 4) Corpus preprocessing: first use the user dictionary obtained in step 1) to preprocess the training data obtained in step 3), namely word segmentation and named-entity recognition, implemented with existing automated tools; the final result of preprocessing is that each piece of intelligence is converted into an intelligence word matrix whose rows are the word-vector dimensions and whose columns correspond to the sentence length, with the named-entity positions marked, the intelligence items coming in pairs;

Step 5) Train the neural network model: feed the matrices obtained in step 4) into the neural network for training to obtain the relation-extraction neural network model; the neural network training method comprises the following steps:
Step 5-1) Input the intelligence word matrix into bidirectional long short-term memory (Bi-LSTM) units to extract context-integrated information, feeding the forward sentence and the reversed sentence into two long short-term memory (LSTM) units respectively; when computing the current time step, the effect of the previous time step is considered iteratively. The combined expressions for the hidden-layer computation and feature extraction of an LSTM unit are as follows:
i_t = σ(W_xi·x_t + W_hi·h_{t-1} + W_ci·c_{t-1} + b_i)
f_t = σ(W_xf·x_t + W_hf·h_{t-1} + W_cf·c_{t-1} + b_f)
g_t = tanh(W_xc·x_t + W_hc·h_{t-1} + W_cc·c_{t-1} + b_c)
c_t = i_t·g_t + f_t·c_{t-1}
o_t = σ(W_xo·x_t + W_ho·h_{t-1} + W_co·c_t + b_o)
h_t = o_t·tanh(c_t)
where: x_t is the intelligence word matrix obtained in step 4) at time t, and is also the input matrix of the neural network;

i_t is the output of the input gate at time t;

f_t is the output of the forget gate at time t;

g_t is the output of the input integration at time t;

c_t and c_{t-1} are the memory-stream states at times t and t-1;

o_t is the output of the output gate at time t;

h_t and h_{t-1} are the hidden-layer information at times t and t-1, i.e. the features extracted by the neural network;

σ() is the sigmoid activation function, and tanh() is the hyperbolic tangent activation function;

W_xi, W_hi, W_ci, etc. are weight parameters to be trained; the first subscript indicates the input quantity being multiplied and the second the computation the weight belongs to;

b_i, b_f, etc. are bias parameters to be trained; the subscript indicates the computation they belong to;

the parameters W_xi, W_hi, W_ci, b_i and b_f are first randomly initialized, then corrected automatically during training, and take their final values as training completes;
Step 5-2) Weight and splice the outputs of the two LSTM units for the forward sentence and the reversed sentence as the final output of the neural network:

o_final = W_fw·h_fw + W_bw·h_bw

where h_fw is the output of the LSTM network processing the forward sentence, and W_fw is its corresponding trainable weight;

h_bw is the output of the LSTM network processing the reversed sentence, and W_bw is its corresponding trainable weight;

o_final is the final output of the neural network;

the weights W_fw and W_bw are likewise randomly initialized first, corrected automatically during training, and take their final values as training completes;
Step 5-3) Compute the attention distribution over the whole intelligence sentence from the neural network outputs at the named-entity positions, and combine the whole-sentence network output according to that distribution, with the following formulas:

α = softmax(tanh(E)·W_a·O_final)
r = α·O_final

where α is the attention distribution matrix and r is the output of the intelligence sentence after targeted integration; E is the output of the recurrent network at the named-entity positions, where, using a fixed-window scheme, the top K most important named entities are concatenated into an entity matrix;

O_final is the output of the recurrent network, of the form [o_1, o_2, o_3 … o_n], where o_1, o_2, o_3 … o_n are the outputs of the corresponding network nodes and n is the number of words in the intelligence item;

W_a is the weight matrix to be trained, softmax() is the softmax classifier function, and tanh() is the hyperbolic tangent activation function; the weight W_a is likewise randomly initialized first, corrected automatically during training, and takes its final value as training completes;
Step 5-4) Splice the feature vectors r of the two intelligence items, feed the result into a fully connected layer, and finally classify the relationship with a softmax classifier; the weights are trained by gradient descent on the resulting predictions;
Step 6) Intelligence acquisition: input text intelligence in pairs (one batch may contain several pairs), where each piece of text intelligence is a passage with a clear central topic; for new intelligence, the user dictionary obtained in step 1) may optionally be expanded;

Step 7) Text preprocessing: using the segmentation tool trained in step 4), the word-vector library obtained in step 2), and the named-entity recognition tool used in step 4), convert the original whole-sentence text from step 6) into intelligence value matrices, where each row is the vector representation of a word, one matrix represents one intelligence item, and the named-entity positions are marked;

Step 8) Relation extraction: feed the paired intelligence matrices processed in step 7) into the relation-extraction neural network model trained in step 5) for automatic relation extraction, finally obtaining the relation category of each pair;

Step 9) Incremental updating: judge whether the relation category obtained in step 8) for each pair is correct; if correct, visualize the intelligence acquired in step 6) together with the corresponding relation category; if wrong, the correctly judged intelligence-relationship triples may be added to the training set of step 3), repeating steps 4) and 5) to retrain the corrected neural network model.
Further: an option in step 1) is to construct a professional-domain user dictionary, which holds proper nouns of a specific field that are difficult to recognize outside that field; other common vocabulary can be recognized automatically. The proprietary vocabulary can be selected from the historical intelligence database: if a term extracted from the historical intelligence database is proprietary, the user only needs to add the known term to the user dictionary of the neural network system.

Preferably: the training set is constructed by extracting a sufficient quantity of intelligence, more than 5,000 pairs, from the historical intelligence database and building the intelligence-relationship triple training data; specifically, the relationship categories are determined first, including cause and consequence, topic and detailed description, location links, and time links, and according to these relationships the intelligence pairs are formed into triples of the form <intelligence 1, intelligence 2, relationship>.
Preferably: text data is extracted from domain-related databases and combined with text corpora from online encyclopedias and news broadcasts; the word-vector library is trained with Google's word2vec toolkit, mapping the text vocabulary into numerical vector data that contains the original semantic information, thereby completing the conversion from natural language to a numerical representation.
Preferably: Chinese is semantically organized in units of words, so whole-sentence input must first be segmented; the professional-domain user dictionary is added during segmentation.

Preferably: the intelligence in the acquisition step should be a short passage of at most 100 words with a clear central topic; relation extraction targets a binary relation, i.e. the object of processing is a pair of intelligence items, so the input to the LSTM units should be text intelligence in pairs.

Preferably: word segmentation and named-entity recognition are implemented with existing automated tools such as nlpir and stanford-ner.

Preferably: the professional-domain user dictionary is used when the automated tools perform segmentation and named-entity recognition.
Compared with the prior art, the invention has the following beneficial effects:

The invention uses a bidirectional recurrent neural network with named-entity-based attention over each word of the intelligence to extract feature information from the word-vector representation of the intelligence, and further classifies the extracted features with a softmax classifier, thereby completing the intelligence relation extraction task. A bidirectional recurrent neural network has strong feature-extraction ability on text data, which overcomes the heavy manual feature-engineering workload of traditional knowledge-base methods and the weak generalization caused by their subjectivity; a bidirectional LSTM network can effectively take the complete context into account, and the attention weights of the named entities automatically assign an importance to each word of the intelligence according to these narrative center words, giving the relation extraction method of the invention higher accuracy than other neural network methods.
Brief Description of the Drawings

The specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings, in which:

FIG. 1 is a flow chart of the intelligence relationship extraction method based on a neural network and an attention mechanism according to the present invention.

FIG. 2 is a schematic diagram of the bidirectional recurrent neural network used in the intelligence relationship extraction method based on a neural network and an attention mechanism according to the present invention.

FIG. 3 is a schematic diagram of the attention mechanism used in the intelligence relationship extraction method based on a neural network and an attention mechanism according to the present invention.
具体实施方式detailed description
下面结合附图和具体实施例,进一步阐明本发明,应理解这些实例仅用于说明本发明而不用于限制本发明的范围,在阅读了本发明之后,本领域技术人员对本发明的各种等价形式的修改均落于本申请所附权利要求所限定的范围。The invention will be further clarified with reference to the accompanying drawings and specific embodiments, which are to be construed as illustrative only and not to limit the scope of the invention. Modifications in the form of the price are all within the scope defined by the claims appended hereto.
As shown in FIG. 1, the intelligence relationship extraction method based on a neural network and an attention mechanism is implemented in two phases: a training phase and an application phase.

(I) Training phase:

As shown in FIG. 1, in the training phase the system first constructs a user dictionary (optional) and trains word vectors, then builds a training set from the historical intelligence database, preprocesses the corpus, and finally trains the relation-extraction neural network model.
a. Construct the user dictionary: the neural network system has an initial user dictionary, and vocabulary is extracted from the historical intelligence database; if an extracted term is proprietary, the user simply adds the known proprietary term to the system's user dictionary, thereby building a proprietary-vocabulary user dictionary. A professional-domain user dictionary holds proper nouns of a specific field that are difficult to recognize outside that field; other common vocabulary is recognized automatically.
b. Train word vectors: extract text data from domain-related databases, combine it with text corpora such as online encyclopedias and news broadcasts, and, using the user dictionary obtained in step (I) a), train a word-vector library with Google's word2vec toolkit; the text vocabulary is mapped into numerical vector data that contains the original semantic information, completing the conversion from natural language to a numerical representation.
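As an illustration, the following Python sketch trains such a word-vector library. It uses gensim's Word2Vec implementation as a stand-in for the Google word2vec toolkit named above; the file names and hyperparameters are assumptions made for the example, not values from the patent.

```python
# Sketch of step (I) b): train a word-vector library on pre-segmented
# domain text. gensim's Word2Vec stands in for the Google word2vec
# toolkit; paths and hyperparameters are illustrative.
import numpy as np
from gensim.models import Word2Vec

# Each corpus line is assumed to be already segmented into words
# separated by spaces (see the preprocessing sketch in step (I) d)).
with open("segmented_corpus.txt", encoding="utf-8") as f:
    sentences = [line.split() for line in f]

model = Word2Vec(
    sentences,
    vector_size=100,  # dimensionality of the word vectors
    window=5,         # context window size
    min_count=2,      # drop very rare tokens
    workers=4,
)
model.wv.save("intel_word_vectors.kv")

# Map a segmented sentence to the numerical matrix used later:
# rows are word-vector dimensions, columns follow the sentence length.
words = ["rescue", "teams", "were", "dispatched"]
matrix = np.column_stack([model.wv[w] for w in words if w in model.wv])
```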
c. Construct the training set: extract more than 5,000 intelligence pairs from the historical intelligence database and build the intelligence-relationship triple training data using the word-vector library obtained in step (I) b). Specifically, the relationship categories must be determined first, such as cause and consequence, topic and detailed description, location links, and time links; according to these relationships, the intelligence pairs are formed into triples of the form <intelligence 1, intelligence 2, relationship>.
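For illustration, the triple structure might be represented as follows; the class name, field names, and category identifiers are hypothetical, chosen only to mirror the four relationship categories listed above.

```python
# Sketch of the <intelligence 1, intelligence 2, relationship> training
# triple of step (I) c). Names are illustrative, not from the patent.
from dataclasses import dataclass

RELATION_CATEGORIES = (
    "cause_effect",   # cause and consequence
    "topic_detail",   # topic and detailed description
    "location_link",  # location connection
    "time_link",      # time connection
)

@dataclass
class IntelTriple:
    intel_1: str    # first intelligence text (a short passage)
    intel_2: str    # second intelligence text
    relation: str   # one of RELATION_CATEGORIES

# The training set should contain more than 5,000 such triples.
train_set = [
    IntelTriple("An explosion occurred in the downtown area ...",
                "Rescue teams were dispatched to the scene ...",
                "cause_effect"),
]
```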
d. Corpus preprocessing: first use the user dictionary obtained in step (I) a) to preprocess the triple training data obtained in step (I) c), namely word segmentation and named-entity recognition, implemented with existing automated tools such as nlpir and stanford-ner. The professional-domain user dictionary is used in this process, and an accuracy of 95% or more can ultimately be reached. The final result of preprocessing is that each piece of intelligence in the triple training data is converted into a matrix whose rows are the word-vector dimensions and whose columns correspond to the sentence length, with the named-entity positions marked; the intelligence items remain in pairs.
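A hedged sketch of this preprocessing step follows. jieba is used here as a stand-in for nlpir, and the named-entity step is reduced to a user-dictionary lookup because stanford-ner runs as an external Java tool; all file names are illustrative.

```python
# Sketch of step (I) d): segmentation with a professional-domain user
# dictionary, entity-position marking, and conversion to a matrix.
# jieba stands in for nlpir; a real system would call nlpir or
# stanford-ner for named-entity recognition.
import jieba
import numpy as np

jieba.load_userdict("domain_user_dict.txt")  # one proper noun per line

with open("domain_user_dict.txt", encoding="utf-8") as f:
    KNOWN_ENTITIES = {line.split()[0] for line in f if line.strip()}

def preprocess(intel_text, word_vectors):
    # Keep only words covered by the word-vector library.
    words = [w for w in jieba.cut(intel_text) if w in word_vectors]
    # Placeholder NER: mark positions of known domain entities.
    entity_positions = [i for i, w in enumerate(words) if w in KNOWN_ENTITIES]
    # Rows = word-vector dimensions, columns = sentence length.
    matrix = np.column_stack([word_vectors[w] for w in words])
    return matrix, entity_positions
```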
e. Neural network model training: each pair of intelligence matrices preprocessed in step (I) d) goes through the following neural network training process. The preprocessed intelligence matrices are fed into the relation-extraction neural network for training. First, the intelligence word matrix is input into the bidirectional long short-term memory network (Bi-LSTM) to extract context-integrated information. The LSTM network equations are as follows:

i_t = σ(W_xi·x_t + W_hi·h_{t-1} + W_ci·c_{t-1} + b_i)
f_t = σ(W_xf·x_t + W_hf·h_{t-1} + W_cf·c_{t-1} + b_f)
g_t = tanh(W_xc·x_t + W_hc·h_{t-1} + W_cc·c_{t-1} + b_c)
c_t = i_t·g_t + f_t·c_{t-1}
o_t = σ(W_xo·x_t + W_ho·h_{t-1} + W_co·c_t + b_o)
h_t = o_t·tanh(c_t)
where: x_t is the matrix obtained in step 4) at time t (i.e. for the t-th word-vector input), and is also the input matrix of the neural network;

i_t is the output of the input gate at time t; it determines how much of the current information the memory stream records;

f_t is the output of the forget gate at time t; it determines, given the current information, how much of the stored memory data the memory stream forgets;

g_t is the output of the input integration at time t; it integrates the information input this time;

c_t and c_{t-1} are the memory-stream states at times t and t-1;

o_t is the output of the output gate at time t; it determines how much data is emitted from the memory stream;

h_t and h_{t-1} are the hidden-layer information at times t and t-1, i.e. the features extracted by the neural network;

σ() is the sigmoid activation function, and tanh() is the hyperbolic tangent activation function;

W_xi, W_hi, W_ci, etc. are weight parameters to be trained; the first subscript indicates the input quantity being multiplied and the second the computation the weight belongs to;

b_i, b_f, etc. are bias parameters to be trained; the subscript indicates the computation they belong to.

The parameters W_xi, W_hi, W_ci, b_i and b_f are first randomly initialized, then corrected automatically during training, and take their final values as training completes.
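The six equations above can be transcribed directly into NumPy. This is a sketch under assumed shapes (x_t a word vector, h and c hidden-state vectors of dimension d_h); the peephole terms W_ci, W_cf, W_cc and W_co are applied elementwise, which is one common reading of the formulas rather than something the patent specifies.

```python
# Direct NumPy transcription of the LSTM equations above, including
# the peephole terms on the memory stream c. Peepholes are applied
# elementwise here, an assumption about the intended weight shapes.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(d_in, d_h, rng=np.random.default_rng(0)):
    """Random initialization; training would correct these values."""
    w = lambda *s: 0.1 * rng.standard_normal(s)
    p = {}
    for gate in ("i", "f", "c", "o"):
        p["Wx" + gate] = w(d_h, d_in)   # multiplies x_t
        p["Wh" + gate] = w(d_h, d_h)    # multiplies h_{t-1}
        p["Wc" + gate] = w(d_h)         # peephole on the memory stream
        p["b" + gate] = np.zeros(d_h)
    return p

def lstm_step(x_t, h_prev, c_prev, p):
    i_t = sigmoid(p["Wxi"] @ x_t + p["Whi"] @ h_prev + p["Wci"] * c_prev + p["bi"])
    f_t = sigmoid(p["Wxf"] @ x_t + p["Whf"] @ h_prev + p["Wcf"] * c_prev + p["bf"])
    g_t = np.tanh(p["Wxc"] @ x_t + p["Whc"] @ h_prev + p["Wcc"] * c_prev + p["bc"])
    c_t = i_t * g_t + f_t * c_prev            # memory-stream update
    o_t = sigmoid(p["Wxo"] @ x_t + p["Who"] @ h_prev + p["Wco"] * c_t + p["bo"])
    h_t = o_t * np.tanh(c_t)                  # extracted feature output
    return h_t, c_t
```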
As shown in FIG. 2, the concrete implementation of the bidirectional recurrent neural network is to train two recurrent networks whose inputs are the forward sentence and the reversed sentence; in the figure, w1, w2, w3, … denote a sequence of words (the sentence), fed to the two networks in forward and reverse order respectively. The two outputs are then spliced as the final output of the neural network, i.e. o1, o2, o3, … in the figure, according to the following formula:

o_final = W_fw·h_fw + W_bw·h_bw

where h_fw is the output of the network processing the forward sentence, and W_fw is its corresponding trainable weight; h_bw is the output of the network processing the reversed sentence, and W_bw is its corresponding trainable weight; o_final is the final output of the neural network. The weights W_fw and W_bw are likewise randomly initialized first, corrected automatically during training, and take their final values as training completes.
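A sketch of the bidirectional combination follows, reusing lstm_step from the previous sketch; re-aligning the backward outputs to forward word order is an implementation choice, not something the patent specifies.

```python
# Sketch of the bidirectional combination: run the LSTM cell above over
# the sentence in both directions, then combine the hidden sequences as
# o_final = W_fw * h_fw + W_bw * h_bw.
import numpy as np

def run_lstm(xs, p, d_h):
    h, c = np.zeros(d_h), np.zeros(d_h)
    outs = []
    for x_t in xs:                       # xs: list of word vectors
        h, c = lstm_step(x_t, h, c, p)   # from the previous sketch
        outs.append(h)
    return np.stack(outs)                # shape (n, d_h)

def bi_lstm(xs, p_fw, p_bw, W_fw, W_bw, d_h):
    h_fw = run_lstm(xs, p_fw, d_h)              # forward sentence
    h_bw = run_lstm(xs[::-1], p_bw, d_h)[::-1]  # reversed, re-aligned
    return h_fw @ W_fw.T + h_bw @ W_bw.T        # o_final, shape (n, d_h)
```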
As shown in Figure 3, the attention distribution over the whole intelligence sentence is computed from the network outputs at the positions of the named entities, and the network's whole-sentence output is combined according to that distribution, as follows:

$$\alpha = \mathrm{softmax}(\tanh(E) \cdot W_a \cdot O_{final})$$
$$r = \alpha \cdot O_{final}$$

where $\alpha$ is the attention-distribution matrix and $r$ is the output of the intelligence sentence after targeted integration; $E$ is the output of the recurrent network at the named-entity positions: using a fixed-window scheme, the top $K$ most important named entities are selected and concatenated into a named-entity matrix;
$O_{final}$ is the output of the recurrent network, of the form $[o_1, o_2, o_3, \dots, o_n]$, where $o_1, o_2, o_3, \dots, o_n$ are the outputs of the corresponding network nodes and $n$ is the number of words in the intelligence;
$W_a$ is the weight matrix to be trained, $\mathrm{softmax}(\cdot)$ the softmax classifier function, and $\tanh(\cdot)$ the hyperbolic tangent activation function.
The trainable weight $W_a$ is likewise randomly initialized, corrected automatically during training, and takes its final value as training of the network completes.
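A sketch of the entity-driven attention under the same assumptions; the default $K$ and the ordering of entities by importance are illustrative choices, since the patent fixes only the fixed-window, top-$K$ scheme.

```python
def entity_attention(O_final, entity_positions, W_a, K=3):
    """alpha = softmax(tanh(E) . W_a . O_final), r = alpha . O_final.

    O_final: (n_words, d) recurrent-network outputs; entity_positions: word
    indices of the named entities, assumed ordered by importance; W_a: (d, d)
    trainable weight matrix, randomly initialized.
    """
    E = O_final[np.asarray(entity_positions[:K])]     # (K, d) named-entity matrix
    scores = np.tanh(E) @ W_a @ O_final.T             # (K, n_words) attention scores
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    alpha = e / e.sum(axis=-1, keepdims=True)         # softmax over the sentence
    return alpha @ O_final                            # r: (K, d) integrated features
```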
The feature information $r$ of the two intelligence items is concatenated and fed into a fully connected layer; finally, a softmax classifier performs the relationship classification, and the weights are trained by gradient descent on the resulting predictions.
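The classification head might be sketched as below; the tanh nonlinearity, layer sizes, and parameter names are assumptions, and the feature shapes are taken to be fixed by the top-$K$ entity window.

```python
def classify_pair(r1, r2, W_fc, b_fc, W_out, b_out):
    """Concatenate the two intelligence features, apply a fully connected
    layer, then a softmax over relation classes; during training, cross-entropy
    on these probabilities would be minimized by gradient descent."""
    z = np.concatenate([r1.ravel(), r2.ravel()])   # spliced pair features
    hidden = np.tanh(W_fc @ z + b_fc)              # fully connected layer
    logits = W_out @ hidden + b_out
    e = np.exp(logits - logits.max())
    return e / e.sum()                             # relation-class probabilities
```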
(2) Application stage:
As shown in Figure 1, in the application stage the intelligence-relationship extraction method of the present invention comprises four steps: intelligence acquisition, text preprocessing, relationship extraction, and incremental update:
a. Intelligence acquisition: each piece of intelligence should be a short passage of at most 100 words with a clear central topic. Relationship extraction targets binary relations, i.e., the object processed is a pair of intelligence items, so the system input should be text intelligence in groups of two, and one batch may contain multiple groups. As shown in Figure 1, for new intelligence the user dictionary of step (1)a) may optionally be expanded to cover new vocabulary in the new intelligence.
b. Text preprocessing: using the word-segmentation tool trained in step (1)d), the word-vector library obtained in step (1)b), and the named-entity recognition tool used in step (1)d), the original whole-sentence text of each pair from step (2)a) is converted into numerical matrices, in which each row is the vector representation of one word; one matrix represents one piece of intelligence, and the positions of the named entities within it are marked. A sketch of this conversion follows.
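In the sketch below, `segment`, `find_entities`, and `word_vectors` stand in for the external segmentation tool, named-entity recognizer, and word-vector library; the real APIs of tools such as nlpir and stanford-ner are not reproduced here, so these callables are assumptions.

```python
def preprocess(sentence, segment, find_entities, word_vectors):
    """Convert one whole-sentence intelligence item into a numeric matrix
    (one row per word vector) plus the marked named-entity positions."""
    words = segment(sentence)                            # word segmentation
    matrix = np.stack([word_vectors[w] for w in words])  # (n_words, vec_dim)
    entity_words = set(find_entities(sentence))          # named-entity recognition
    positions = [i for i, w in enumerate(words) if w in entity_words]
    return matrix, positions
```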
c. Relationship extraction: the pairs of intelligence matrices prepared in step (2)b) are fed into the relation-extraction neural network model trained in step (1)e) for automated relationship extraction, finally yielding the relationship category of each pair, as the sketch after this step illustrates.
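Putting the sketches together, automated extraction for one intelligence pair might be wired up as follows; this is purely illustrative, with `params` bundling the trained weights and the tool callables assumed above.

```python
def extract_relation(sent_a, sent_b, params):
    """Predict the relation class index for one pair of intelligence items."""
    feats = []
    for sent in (sent_a, sent_b):
        matrix, ents = preprocess(sent, params["segment"],
                                  params["find_entities"], params["vectors"])
        O = bilstm_output(matrix, params["p_fw"], params["p_bw"],
                          params["W_fw"], params["W_bw"])
        feats.append(entity_attention(O, ents, params["W_a"]))
    probs = classify_pair(feats[0], feats[1], *params["head"])
    return int(np.argmax(probs))        # index of the predicted relation class
```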
d. Incremental update: as shown in Figure 1, the system supports correcting erroneous judgments. The relationship category obtained in step (2)c) for each pair of intelligence is judged correct or incorrect. If correct, it is visualized together with the intelligence acquired in step (2)a) and the corresponding relationship category; if incorrect, the correctly judged intelligence-relationship triple training data may optionally be added to the training set of step (1)c), and steps (1)d) and (1)e) repeated to retrain and correct the neural network model.
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and such improvements and refinements should likewise be regarded as falling within the scope of protection of the present invention.

Claims (8)

1. An intelligence-relationship extraction method based on a neural network and an attention mechanism, characterized in that it comprises the following steps:
Step 1) Construct a user dictionary; the neural network system has an initial user dictionary;
Step 2) Train word vectors: extract text data from databases related to the field, train a word-vector library using the user dictionary obtained in step 1), and map the text vocabulary in the text data into numerical vector data;
Step 3) Construct a training set: extract intelligence pairs from a historical intelligence database and, using the word-vector library obtained in step 2), convert each pair of intelligence into intelligence-relationship triple training data <intelligence 1, intelligence 2, relationship>;
Step 4) Corpus preprocessing: first use the user dictionary obtained in step 1) to preprocess the training data obtained in step 3), namely word segmentation and named-entity recognition, both implemented with existing automated tools; the final result of preprocessing is that each piece of intelligence is converted into an intelligence word matrix whose rows are the word-vector dimensions and whose columns correspond to the sentence length, with the named-entity positions marked and the intelligence kept in groups of two;
Step 5) Neural network model training: feed the matrices obtained in step 4) into the neural network for training to obtain the relation-extraction neural network model; the training method of the neural network comprises the following steps:
Step 5-1) Input the intelligence word matrix into a bidirectional long short-term memory (Bi-LSTM) unit to extract information from the combined context, feeding the forward-order and reverse-order sentences into two long short-term memory (LSTM) units, respectively; when computing the current time step, the effect of the previous time step is taken into account iteratively; the combined expressions for the hidden-layer computation and feature extraction of an LSTM unit are as follows:
$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i)$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f)$$
$$g_t = \tanh(W_{xc} x_t + W_{hc} h_{t-1} + W_{cc} c_{t-1} + b_c)$$
$$c_t = i_t\, g_t + f_t\, c_{t-1}$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o)$$
$$h_t = o_t \cdot \tanh(c_t)$$
where: $x_t$ denotes the intelligence word matrix obtained in step 4) at time $t$, which is also the input matrix of the neural network;
$i_t$ denotes the output of the input gate at time $t$;
$f_t$ denotes the output of the forget gate at time $t$;
$g_t$ denotes the output of the input integration at time $t$;
$c_t$ and $c_{t-1}$ denote the memory-stream states at times $t$ and $t-1$, respectively;
$o_t$ denotes the output of the output gate at time $t$;
$h_t$ and $h_{t-1}$ denote the hidden-layer information at times $t$ and $t-1$, respectively, i.e., the features extracted by the neural network;
$\sigma(\cdot)$ denotes the sigmoid activation function and $\tanh(\cdot)$ the hyperbolic tangent activation function;
$W_{xi}$, $W_{hi}$, $W_{ci}$, etc. denote weight parameters to be trained; in each subscript, the first letter identifies the input being multiplied and the second the computation the weight belongs to;
$b_i$, $b_f$, etc. denote bias parameters to be trained, the subscript identifying the computation they belong to;
the trainable parameters $W_{xi}$, $W_{hi}$, $W_{ci}$, $b_i$, and $b_f$ are all randomly initialized, corrected automatically during training, and take their final values as training of the network completes;
Step 5-2) Combine, by weighted splicing, the outputs of the two LSTM units for the forward-order and reverse-order sentences as the final output of the neural network:

$$o_{final} = W_{fw} h_{fw} + W_{bw} h_{bw}$$

where $h_{fw}$ denotes the output of the LSTM network processing the forward-order sentence and $W_{fw}$ its corresponding trainable weight; $h_{bw}$ denotes the output of the LSTM network processing the reverse-order sentence and $W_{bw}$ its corresponding trainable weight; $o_{final}$ denotes the final output of the neural network; the trainable weights $W_{fw}$ and $W_{bw}$ are likewise randomly initialized, corrected automatically during training, and take their final values as training of the network completes;
Step 5-3) Compute the attention distribution over the whole intelligence sentence from the network outputs at the named-entity positions, and combine the network's whole-sentence output according to that distribution, as follows:

$$\alpha = \mathrm{softmax}(\tanh(E) \cdot W_a \cdot O_{final})$$
$$r = \alpha \cdot O_{final}$$

where $\alpha$ is the attention-distribution matrix and $r$ is the output of the intelligence sentence after targeted integration; $E$ is the output of the recurrent network at the named-entity positions: using a fixed-window scheme, the top $K$ most important named entities are selected and concatenated into a named-entity matrix; $O_{final}$ is the output of the recurrent network, of the form $[o_1, o_2, o_3, \dots, o_n]$, where $o_1, o_2, o_3, \dots, o_n$ are the outputs of the corresponding network nodes and $n$ is the number of words in the intelligence; $W_a$ is the weight matrix to be trained, $\mathrm{softmax}(\cdot)$ the softmax classifier function, and $\tanh(\cdot)$ the hyperbolic tangent activation function; the trainable weight $W_a$ is likewise randomly initialized, corrected automatically during training, and takes its final value as training of the network completes;
Step 5-4) For the feature information $r$ of the two intelligence items, concatenate the features and feed them into a fully connected layer; finally, use a softmax classifier to perform the relationship classification, and train the weights by gradient descent on the resulting predictions;
Step 6) Intelligence acquisition: input text intelligence in groups of two, one batch possibly containing multiple groups, where each piece of text intelligence is a passage of text with a clear central topic; for new intelligence, the user dictionary obtained in step 1) may optionally be expanded;
Step 7) Text preprocessing: using the word-segmentation tool trained in step 4), the word-vector library obtained in step 2), and the named-entity recognition tool used in step 4), convert the original whole-sentence text from step 6) into intelligence numerical matrices, in which each row is the vector representation of one word; one matrix represents one piece of intelligence, and the positions of the named entities within it are marked;
Step 8) Relationship extraction: feed the pairs of intelligence matrices prepared in step 7) into the relation-extraction neural network model trained in step 5) for automated relationship extraction, finally obtaining the relationship category of each group of intelligence;
Step 9) Incremental update: judge whether the relationship category obtained in step 8) for each group of intelligence is correct; if correct, visualize it together with the intelligence acquired in step 6) and the corresponding relationship category; if incorrect, the correctly judged intelligence-relationship triple training data may optionally be added to the training set in step 3), and steps 4) and 5) repeated to retrain and correct the neural network model.
2. The intelligence-relationship extraction method based on a neural network and an attention mechanism according to claim 1, characterized in that: an optional scheme in step 1) is to construct a professional-domain user dictionary, which refers to proper nouns of a specific field, i.e., words difficult to recognize outside that field; other common vocabulary can be recognized automatically; the proper vocabulary may be selected from the historical intelligence database, and if vocabulary extracted from the historical intelligence database is proper vocabulary, the user only needs to add the known proper vocabulary to the user dictionary of the neural network system.
3. The intelligence-relationship extraction method based on a neural network and an attention mechanism according to claim 1, characterized in that: the training set is constructed by extracting a sufficient quantity of intelligence, more than 5,000 items, from the historical intelligence database to build the intelligence-relationship triple training data; specifically, the relationship categories are determined first, comprising cause and effect, topic and elaboration, location linkage, and time linkage, and according to the different relationships the intelligence pairs are formed into triples of the form <intelligence 1, intelligence 2, relationship>.
4. The intelligence-relationship extraction method based on a neural network and an attention mechanism according to claim 1, characterized in that: text data are extracted from databases related to the field and combined with text corpora from online encyclopedias and news broadcasts; the word-vector library is trained with the Google toolkit word2vec, mapping the text vocabulary into numerical vector data; the vector data contain the original semantic information, thereby completing the transformation from natural language to numerical representation.
5. The intelligence-relationship extraction method based on a neural network and an attention mechanism according to claim 1, characterized in that: Chinese is semantically organized in units of words, so whole-sentence input must first undergo word segmentation; during word segmentation, the professional-domain user dictionary is incorporated.
6. The intelligence-relationship extraction method based on a neural network and an attention mechanism according to claim 1, characterized in that: in the intelligence-acquisition step, each piece of intelligence should be a short passage of at most 100 words with a clear central topic; relationship extraction targets binary relations, i.e., the object processed is a pair of intelligence items, so the input of the LSTM unit should be text intelligence in groups of two.
7. The intelligence-relationship extraction method based on a neural network and an attention mechanism according to claim 1, characterized in that: word segmentation and named-entity recognition are implemented with existing automated tools, such as nlpir and stanford-ner.
8. The intelligence-relationship extraction method based on a neural network and an attention mechanism according to claim 7, characterized in that: the professional-domain user dictionary is used when the automated tools perform word segmentation and named-entity recognition.
PCT/CN2017/089137 2017-05-27 2017-06-20 Neural network and attention mechanism-based information relation extraction method WO2018218707A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710392030.5 2017-05-27
CN201710392030.5A CN107239446B (en) 2017-05-27 2017-05-27 An intelligence-relationship extraction method based on a neural network and an attention mechanism

Publications (1)

Publication Number Publication Date
WO2018218707A1 (en)

Family

ID=59984667

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/089137 WO2018218707A1 (en) 2017-05-27 2017-06-20 Neural network and attention mechanism-based information relation extraction method

Country Status (2)

Country Link
CN (1) CN107239446B (en)
WO (1) WO2018218707A1 (en)


Also Published As

Publication number Publication date
CN107239446B (en) 2019-12-03
CN107239446A (en) 2017-10-10


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 17912327; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 17912327; Country of ref document: EP; Kind code of ref document: A1)