CN110765755A - A Semantic Similarity Feature Extraction Method Based on Double Selection Gate - Google Patents

A Semantic Similarity Feature Extraction Method Based on Double Selection Gate

Info

Publication number
CN110765755A
CN110765755A
Authority
CN
China
Prior art keywords
sentence
vector
selection gate
matching
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911032492.1A
Other languages
Chinese (zh)
Inventor
蔡晓东
秦菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology filed Critical Guilin University of Electronic Technology
Priority to CN201911032492.1A
Publication of CN110765755A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a semantic similarity feature extraction method based on a double selection gate, relating to the field of natural language processing. First, the input sentence pair is segmented into words and vectorized to obtain word vectors, and the word-vector sequences are fed into a bidirectional long short-term memory network to obtain the context information vectors of the two sentences. Next, the core feature vectors of the sentence pair are obtained through the double selection gate. The vectors are then fed into a multi-angle semantic feature matching network to obtain the feature matching vectors of the sentence pair. Finally, the two semantic feature matching vectors are each merged through a bidirectional long short-term memory aggregation layer, and the similarity of the sentence pair is predicted. The method effectively alleviates the low matching efficiency caused by information redundancy, while avoiding the cost of manually extracting core information.

Description

A Semantic Similarity Feature Extraction Method Based on a Double Selection Gate

Technical Field

The invention relates to the field of natural language processing, and in particular to a semantic similarity feature extraction method based on a double selection gate.

Background

Today's world is awash with information, most of which is stored as text. An important task in artificial intelligence is to organize and "express" this textual information so that computers can "understand" it the way humans do. Because language contains many sources of uncertainty, for example a single word can carry multiple meanings and the same concept can be expressed in different ways, traditional text similarity methods based on string matching are no longer adequate for search engines and question answering systems. When a user enters a keyword to find matching information, much of the returned content may not actually match the query, and perhaps only a small portion is relevant, which is extremely inconvenient for the user. Computing text similarity through deeper semantic understanding has therefore become a research focus in natural language processing.

The prior art offers many methods for matching sentence semantic similarity; early methods focused largely on string matching. The basic pipeline usually has two steps: the two sentences to be compared are first fed into a recurrent network and mapped to vector representations, and the similarity of the two sentence vectors is then judged by their cosine distance. Although the traditional string-based approach to sentence-pair similarity helps filter out some irrelevant information when searching for related questions, the quality of the results remains unsatisfactory. Judging sentence similarity from strings only measures the distance between words at the word level, without contextual semantic information; this leads to mismatches and ambiguity, and end users cannot quickly find the information relevant to their keywords.

Therefore, a new semantic similarity feature extraction method is needed.

Summary of the Invention

The purpose of the present invention is to provide a semantic similarity feature extraction method based on a double selection gate, which can automatically determine the semantic similarity of two sentences. By automatically selecting the core information twice, it effectively reduces redundant sentence information and improves both the accuracy and the efficiency of sentence similarity judgment.

The technical solution is as follows:

S100. Perform word segmentation on the sentence pair P and Q to be processed, and vectorize the segmented words to obtain word vectors;

S200. Feed all word vectors of the sentence pair P and Q obtained in step S100 into a first recurrent neural network in order to obtain context information vectors, wherein the last context information vector of a sentence represents that sentence's sentence vector;

S300. Feed the sentence vectors of the sentence pair P and Q into a first-level selection gate to obtain core information features;

S400. Feed the core information obtained in step S300 into a second-level selection gate to extract core information features again;

S500. Feed the core information obtained in step S400 into a multi-angle semantic matching network, which comprises four strategies, namely full matching, maximum pooling matching, attention matching, and maximum attention matching, to obtain the feature matching vectors of the sentence pair;

S600. Pass the matching vectors obtained in step S500 through a second neural network to fuse the feature matching vectors into one fixed-length vector, which is fed into a prediction layer to compute the similarity probability distribution of the sentence pair.

Preferably, the first recurrent neural network is used to generate the state vectors of the context information.

Preferably, the first layer of the first recurrent neural network is a unidirectional long short-term memory network and the second layer is a bidirectional long short-term memory network; each layer comprises a plurality of connected LSTM cell modules.

Preferably, the first recurrent neural network comprises two layers;

the first layer of the first recurrent neural network is used to generate word-level vectors;

the second layer of the first recurrent neural network is used to generate context information vectors.

Preferably, the first-level selection gate and the second-level selection gate respectively comprise a plurality of first-level selection gate units and second-level selection gate units;

the first-level and second-level selection gates differ in structure and in parameters.

Preferably, in step S200, all word vectors of the sentence pair obtained in step S100 are fed into the first recurrent network in order to obtain the sentence state vector after each word is input, specifically:

the i-th word vector and the output word vector at time i-1 are fed into the i-th LSTM cell module, which processes them to produce the sentence state vector after the i-th word vector.

Preferably, in step S300, feeding the sentence vectors of the sentence pair into the first-level selection gate to obtain core information features comprises:

feeding the context information vector of sentence P at each time step and the i-th sentence vector of sentence Q into the first-level selection gate unit, where the i-th first-level selection gate unit processes them to produce the core information.

Preferably, in step S400, feeding the core information obtained in step S300 into the second-level selection gate to extract core information features again comprises:

feeding the core information produced by the i-th first-level selection gate unit into the i-th second-level selection gate unit, which processes it to produce the core information features.

Preferably, in step S500, feeding the core information obtained in step S400 into the multi-angle semantic matching network to obtain the feature matching vectors comprises:

full matching, which computes the cosine similarity between the context information vector of sentence P at each time step and the sentence vector of sentence Q to obtain a feature matching vector;

maximum pooling matching, which computes the cosine similarity between the context information vector of sentence P at each time step and the context information vector of sentence Q at each time step, taking the maximum value as the feature matching vector;

attention matching, which computes the cosine between the context information vector of sentence P at time i and the context information vector of sentence Q at time i to obtain i cosine values for sentence P, takes the weighted cosine values as attention weights, multiplies them with the context information of sentence Q at each time step, and matches the result by cosine against the context information vector of sentence P at each time step to obtain the feature matching vector;

maximum attention matching, which computes the cosine between the context information vector of sentence P at time i and the context information vector of sentence Q at time i to obtain i cosine values for sentence P, selects the largest of the i cosine values as the attention weight, multiplies it with the context information of sentence Q, and matches the result by cosine against the context information vector of sentence P at each time step to obtain the feature matching vector.

Preferably, the second neural network comprises two bidirectional long short-term memory networks, which aggregate the feature matching vectors of the sentence pair into a fixed-length vector.

Preferably, in step S600, passing the matching vectors obtained in step S500 through the second neural network to fuse the feature matching vectors into one fixed-length vector fed into the prediction layer to compute the similarity probability distribution of the sentence pair comprises:

aggregating the four feature matching vectors obtained from the four matchings of sentence P into one fixed-length feature matching vector through the second recurrent neural network;

aggregating the four feature matching vectors obtained from the four matchings of sentence Q into one fixed-length feature matching vector through the bidirectional long short-term memory network;

and feeding the two feature matching vectors of sentence P and sentence Q into the prediction layer to obtain the sentence-pair similarity.

Preferably, in step S100, Word2Vec is used to vectorize the words segmented by Jieba. Word2Vec is a predictive model that learns word embeddings efficiently; its basic idea is to represent each word of natural language as a short, fixed-dimensional vector in a shared semantic space.

The technical solutions provided by the embodiments of the present invention bring the following beneficial effects:

1. The semantic similarity feature extraction method based on a double selection gate does not rely on manual removal of redundant information; it automatically extracts the core information of a sentence, and the semantic similarity model automatically determines the semantic similarity of two sentences. Sentence similarity determined with this model is more accurate and efficient, helping users find better-matching results in question answering or search systems.

2. The method uses a bidirectional long short-term memory network to produce context-aware vector representations of sentences. Its cell state can capture long-distance dependencies in text, remember long-term state, and update, forget, and filter information, expressing contextual relationships better while mitigating vanishing and exploding gradients. A traditional RNN, which combines the past output and the current input through an activation function, can only consider recent states.

3. The method uses two selection gates to automatically extract the core semantic information of a sentence, avoiding the influence of redundant information on semantic similarity judgment and improving matching efficiency.

4. The method uses a multi-angle semantic matching network to match two sentences in four ways: full matching, maximum pooling matching, attention matching, and maximum attention matching. The four strategies make full use of the context information vectors for finer-grained, multi-angle matching, effectively avoiding the low accuracy of traditional methods that judge similarity only by the cosine distance between the words of two sentences. A bidirectional long short-term memory network fuses the matching vectors into fixed-length vectors, effectively controlling their dimensionality and facilitating the prediction layer's similarity computation.

5. The method effectively improves the accuracy and efficiency of sentence semantic similarity judgment and is applicable to both Chinese and English sentence-pair corpora.

Brief Description of the Drawings

FIG. 1 is a flowchart of the method according to an embodiment of the present invention.

FIG. 2 is a structural diagram of the double selection gate module according to an embodiment of the present invention.

FIG. 3 is a structural diagram of the multi-angle semantic matching network according to an embodiment of the present invention.

Detailed Description

To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. The specific embodiments described here serve only to explain the present invention, not to limit it.

It should be noted that, where no conflict arises, the embodiments of the present invention and the features therein may be combined with one another.

Embodiment 1

Referring to FIG. 1, the present invention provides a semantic similarity feature extraction method based on a double selection gate, comprising:

S100. Perform word segmentation on the sentence pair P and Q to be processed, and vectorize the segmented words to obtain word vectors.

Word segmentation in step S100 splits the words of a sentence into a reasonable sequence that fits the context. It is one of the key techniques and difficulties in natural language understanding and text processing, and an important stage in the semantic similarity model. Chinese word segmentation is comparatively complex: there are no explicit delimiters between words, and word usage is flexible, varied, and semantically rich, which easily produces ambiguity. Research shows that the main difficulties of statistics-based Chinese segmentation are ambiguity resolution and the discovery of proper nouns and new words. The present invention uses Jieba to segment Chinese text and NLTK to segment English text, improving segmentation accuracy.

Models for vectorizing words include the one-hot model and the distributed model. The one-hot model is simple, but its dimensionality cannot be controlled and it cannot represent relationships between words well. This method therefore adopts the distributed model, specifically using Word2Vec to vectorize the words.
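As an illustration only (not part of the claimed method), the segmentation and vectorization of step S100 might be sketched as follows in Python; the toy corpus and the training parameters are assumptions, and the gensim 4.x API is assumed.

```python
import jieba
from gensim.models import Word2Vec

# Step S100: segment the sentence pair with Jieba (NLTK would be used for English text).
p = jieba.lcut("计算机的运行速度很快")   # "the computer runs very fast"
q = jieba.lcut("电脑跑得非常快")         # "the PC runs extremely fast"

# Train Word2Vec on a toy corpus; in practice the embeddings would be trained
# on a large corpus. vector_size, window, and min_count are illustrative.
model = Word2Vec(sentences=[p, q], vector_size=100, window=5, min_count=1)

# Look up the word vectors that are fed, in order, into the first recurrent
# network of step S200.
p_vectors = [model.wv[w] for w in p]
q_vectors = [model.wv[w] for w in q]
```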

S200. Feed all word vectors of the sentence pair P and Q obtained in step S100 into the first recurrent neural network in order to obtain context information vectors, where the last context information vector of a sentence represents that sentence's sentence vector.

Here the first recurrent neural network generates the state vectors of the context information. It comprises two layers: the first is a unidirectional long short-term memory network that generates word-level vectors, and the second is a bidirectional long short-term memory network that generates context information vectors. Each layer comprises a plurality of connected LSTM cell modules, and modules in different layers have different parameters, so that word-level and context information vectors are generated respectively.

All word vectors of the sentence pair obtained in step S100 are fed into the first recurrent network in order, yielding the sentence state vector after each word is input. Specifically:

the i-th word vector and the output word vector at time i-1 are fed into the i-th LSTM cell module, which processes them to produce the sentence state vector after the i-th word vector.

S300. Feed the sentence vectors of the sentence pair P and Q into the first-level selection gate to obtain core information features.

Specifically, the context information vector of sentence P at each time step and the i-th sentence vector of sentence Q are fed into the first-level selection gate unit, and the i-th first-level selection gate unit processes them to produce the core information.

S400. Feed the core information obtained in step S300 into the second-level selection gate to extract core information features again. Specifically, the core information produced by the i-th first-level selection gate unit is fed into the i-th second-level selection gate unit, which processes it to produce the core information features.

The first-level selection gate and the second-level selection gate respectively comprise a plurality of first-level selection gate units and second-level selection gate units;

the first-level and second-level selection gates differ in structure and in parameters.

S500. Feed the core information obtained in step S400 into the multi-angle semantic matching network, which comprises four strategies, namely full matching, maximum pooling matching, attention matching, and maximum attention matching, to obtain the feature matching vectors of the sentence pair. Specifically,

full matching computes the cosine similarity between the context information vector of sentence P at each time step and the sentence vector of sentence Q, yielding a feature matching vector;

maximum pooling matching computes the cosine similarity between the context information vector of sentence P at each time step and the context information vector of sentence Q at each time step, and takes the maximum value as the feature matching vector;

attention matching computes the cosine between the context information vector of sentence P at time i and the context information vector of sentence Q at time i, yielding i cosine values for sentence P; the weighted cosine values serve as attention weights and are multiplied with the context information of sentence Q at each time step, and the result is then matched by cosine against the context information vector of sentence P at each time step to obtain the feature matching vector;

maximum attention matching computes the cosine between the context information vector of sentence P at time i and the context information vector of sentence Q at time i, yielding i cosine values for sentence P; the largest of the i cosine values is selected as the attention weight and multiplied with the context information of sentence Q, and the result is then matched by cosine against the context information vector of sentence P at each time step to obtain the feature matching vector.
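A minimal numpy sketch of the four matching strategies follows, for illustration. It is single-perspective and untrained; the patent's matching network is multi-perspective with trainable weights per perspective (compare the Bilateral Multi-Perspective Matching paper cited below), and the softmax normalization of the attention weights is an assumption.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two 1-D vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def full_matching(H_p, q_sent):
    # Match every time step of P against Q's sentence vector (its last context vector).
    return np.array([cosine(h, q_sent) for h in H_p])

def maxpool_matching(H_p, H_q):
    # Match every time step of P against every time step of Q and keep the maximum.
    return np.array([max(cosine(h_p, h_q) for h_q in H_q) for h_p in H_p])

def attentive_matching(H_p, H_q):
    # Cosines between a step of P and all steps of Q act as attention weights over Q;
    # the weighted summary of Q is matched back against that step of P.
    out = []
    for h_p in H_p:
        w = np.array([cosine(h_p, h_q) for h_q in H_q])
        w = np.exp(w) / np.exp(w).sum()   # normalization choice is an assumption
        out.append(cosine(h_p, (w[:, None] * H_q).sum(axis=0)))
    return np.array(out)

def max_attentive_matching(H_p, H_q):
    # Use only Q's time step with the largest cosine value as the summary.
    return np.array([cosine(h_p, H_q[np.argmax([cosine(h_p, h_q) for h_q in H_q])])
                     for h_p in H_p])

# Toy usage with random stand-ins for the context vectors of P and Q.
rng = np.random.default_rng(0)
H_p, H_q = rng.normal(size=(5, 8)), rng.normal(size=(6, 8))
m = [f(H_p, H_q) for f in (maxpool_matching, attentive_matching, max_attentive_matching)]
m.insert(0, full_matching(H_p, H_q[-1]))  # the four matching vectors for sentence P
```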

The second neural network comprises two bidirectional long short-term memory networks, which aggregate the feature matching vectors of the sentence pair into a fixed-length vector.

S600. Pass the matching vectors obtained in step S500 through the second neural network to fuse the feature matching vectors into one fixed-length vector, which is fed into the prediction layer to compute the similarity probability distribution of the sentence pair. Specifically,

the four feature matching vectors obtained from the four matchings of sentence P are aggregated into one fixed-length feature matching vector through the second recurrent neural network;

the four feature matching vectors obtained from the four matchings of sentence Q are likewise aggregated into one fixed-length feature matching vector through the bidirectional long short-term memory network;

the two feature matching vectors of sentence P and sentence Q are fed into the prediction layer to obtain the sentence-pair similarity.

In step S100, Word2Vec is used to vectorize the words segmented by Jieba.

Embodiment 2

Building on Embodiment 1, the first recurrent neural network consists of one unidirectional LSTM layer and one bidirectional LSTM layer. Each layer comprises a plurality of connected LSTM cell modules; the input gate, forget gate, update gate, and filtered output gate of each LSTM cell process the current input and the previous output. The first layer of the first recurrent neural network comprises multiple connected unidirectional LSTM cell modules and produces the state vector of each word. The second layer comprises multiple connected bidirectional LSTM cell modules and produces the sentence context information vectors.

In this method, the first recurrent neural network first models the words and the context of a sentence, yielding the state vector at each word's time step and the context information vector of the sentence at each time step. As shown in FIG. 2, the first recurrent neural network of step S200 uses a long short-term memory network (LSTM), computed as follows:

f_t = σ(W_f w_t + U_f h_{t-1} + b_f)

i_t = σ(W_i w_t + U_i h_{t-1} + b_i)

o_t = σ(W_o w_t + U_o h_{t-1} + b_o)

c̃_t = tanh(W_c w_t + U_c h_{t-1} + b_c)

c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t

h_t = o_t ⊙ tanh(c_t)
In the above formulas, f_t is the output of the forget gate; i_t is the output of the input gate; o_t is the output of the output gate; W_f, W_i, W_o, W_c and b_f, b_i, b_o, b_c are the weight matrices and bias vectors of the forget gate, input gate, output gate, and selection gate; c̃_t is the new memory information; c_t is the updated memory content of the LSTM unit; σ is the sigmoid function; ⊙ is the element-wise product; h_{t-1} is the hidden-layer output at time t-1; and w_t is the input at time t.

In the method of the present invention, because the sentence context is modeled by a recurrent neural network, the sentence state vector after the word input at time t theoretically contains the information of all preceding words. In other words, the sentence state vector h_n obtained after the last word contains all the information of the whole sentence; h_n therefore represents the state vector of the whole sentence, i.e., the sentence vector.
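For illustration only, the cell equations above can be written directly in numpy. The dimensions, the random initialization, and the placeholder inputs are assumptions; a trained implementation would use an LSTM layer from a deep learning framework.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

d_in, d_h = 100, 128  # assumed word-vector and hidden sizes
rng = np.random.default_rng(0)

# One weight matrix and bias per gate, mirroring W_f..W_c, U_f..U_c, b_f..b_c above.
W = {k: rng.normal(scale=0.1, size=(d_h, d_in)) for k in "fioc"}
U = {k: rng.normal(scale=0.1, size=(d_h, d_h)) for k in "fioc"}
b = {k: np.zeros(d_h) for k in "fioc"}

def lstm_cell(w_t, h_prev, c_prev):
    f = sigmoid(W["f"] @ w_t + U["f"] @ h_prev + b["f"])        # forget gate f_t
    i = sigmoid(W["i"] @ w_t + U["i"] @ h_prev + b["i"])        # input gate i_t
    o = sigmoid(W["o"] @ w_t + U["o"] @ h_prev + b["o"])        # output gate o_t
    c_tilde = np.tanh(W["c"] @ w_t + U["c"] @ h_prev + b["c"])  # new memory
    c = f * c_prev + i * c_tilde                                # updated memory c_t
    h = o * np.tanh(c)                                          # hidden state h_t
    return h, c

# Running the cell over a sentence; the final h is the sentence vector h_n.
h, c = np.zeros(d_h), np.zeros(d_h)
for w_t in rng.normal(size=(7, d_in)):  # 7 placeholder word vectors
    h, c = lstm_cell(w_t, h, c)
```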

Embodiment 3

Building on Embodiment 1 or 2, the double selection gate comprises two selection gates that differ in structure and in parameters. Using two different selection gates helps filter out redundant information in sentences and extract the core information more accurately. The first-layer selection gate is computed as follows:

s = h_n

sGate_i = σ(W_s h_i + U_s s + b)

h'_i = h_i ⊙ sGate_i

In the above formulas, the sentence vector is constructed from the sentence's context hidden vectors: the hidden layer h_n of the sentence is taken as the sentence vector s; sGate_i is the gate vector; W_s and U_s are weight matrices; b is a bias vector; σ is the sigmoid activation function; and ⊙ is the element-wise product.

The second-layer selection gate computes the context vector at time t: it computes the selection-gate weights from the sentence vector of the previous time step and the selection-gate hidden state h'_i, and finally normalizes the weights. The formulas are as follows:

e_{i,j} = v_a^T tanh(W_a s_{t-1} + U_a h'_i)

a_{i,j} = exp(e_{i,j}) / Σ_i exp(e_{i,j})

c_k = Σ_i a_{i,j} h'_i

In the above formulas, h'_i is the context hidden vector; v_a, W_a, and U_a are weight matrices; a_{i,j} is the normalized selection-gate weight; and c_k is the core feature vector of the k-th sentence, k = 1, 2, ..., L, where L is the number of sentences in the text.

Referring to FIG. 2, sentence P is denoted P = [p_1, p_2, ..., p_i, ..., p_n] and sentence Q is denoted Q = [q_1, q_2, ..., q_i, ..., q_m], the input sentence-pair sequences. The model takes the words as input and obtains, through step S200, the context information vector representation of the sentence at each time step: the context hidden-vector matrix H^p of sentence P and the context vector matrix H^q of sentence Q. The core information is then extracted through the two-layer selection gates of steps S300 and S400, yielding the core feature representation of sentence P and, in the same way, that of sentence Q.

The method of the present invention obtains sentence context information vectors through the recurrent neural network, strengthening the contextual semantic association between the two sentences and enabling a better judgment of their semantic similarity.

As shown in FIG. 3, the second recurrent neural network is a bidirectional LSTM network composed of multiple connected bidirectional LSTM cell modules. To turn the feature matching vectors produced by the multi-angle matching network into a fixed-length vector for the prediction layer, the matching vectors are fed into the bidirectional LSTM network and fused into a fixed-length vector.

To obtain the similarity judgment for the two sentences, the present invention uses the second recurrent neural network: the four feature matching vectors of sentence P are fed into the second recurrent neural network and fused into one fixed-length vector, and the four feature matching vectors of sentence Q are processed in the same way, yielding two fixed-length matching vectors. These vectors are fed into the prediction layer to obtain the similarity probability distribution of the sentence pair.
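To illustrate the aggregation and prediction, a self-contained numpy sketch follows; the plain tanh recurrence stands in for one LSTM direction, and the fully connected softmax prediction layer is an assumption, since the exact form of the prediction layer is not spelled out above.

```python
import numpy as np

def softmax(z):
    z = np.exp(z - z.max())
    return z / z.sum()

def rnn_last(M, W, U, b):
    # Minimal recurrent pass standing in for one LSTM direction; returns the
    # final hidden state. A real implementation would reuse the LSTM cell above.
    h = np.zeros(U.shape[0])
    for x in M:
        h = np.tanh(W @ x + U @ h + b)
    return h

def aggregate(M, fwd, bwd):
    # Fixed-length vector: final states of the forward and backward passes.
    return np.concatenate([rnn_last(M, *fwd), rnn_last(M[::-1], *bwd)])

def predict_similarity(v_p, v_q, W_pred, b_pred):
    # Two-class softmax over the concatenated fixed-length vectors (assumption).
    return softmax(W_pred @ np.concatenate([v_p, v_q]) + b_pred)

# Toy demonstration with random, untrained parameters. M_p/M_q: one row per
# time step, one column per matching strategy (an assumed layout).
rng = np.random.default_rng(0)
d_m, d_h = 4, 8  # assumed matching-vector and hidden sizes
mk = lambda: (rng.normal(scale=0.1, size=(d_h, d_m)),
              rng.normal(scale=0.1, size=(d_h, d_h)),
              np.zeros(d_h))
M_p, M_q = rng.normal(size=(6, d_m)), rng.normal(size=(7, d_m))
v_p, v_q = aggregate(M_p, mk(), mk()), aggregate(M_q, mk(), mk())
probs = predict_similarity(v_p, v_q,
                           rng.normal(scale=0.1, size=(2, 4 * d_h)),
                           np.zeros(2))  # similarity probability distribution
```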

Sentence semantic similarity determined by the method of the present invention not only uses the contextual information between sentences but also automatically extracts core information features from the sentences as input to the matching network, improving matching accuracy while reducing the redundant information the matching network must process, thereby improving matching efficiency. Words with the same meaning but different surface forms, such as "计算机" and "电脑" (both meaning "computer"), can also be judged similar by the model: similarity is determined not merely by the distance between the words but by the contextual information of the sentences in which they appear.

The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A semantic similarity feature extraction method based on a double selection gate, characterized by comprising the following steps: S100, performing word segmentation on the sentence pair P and Q to be processed, and vectorizing the segmented words to obtain word vectors; S200, feeding all word vectors of the sentence pair P and Q obtained in step S100 into a first recurrent neural network in order to obtain context information vectors, wherein the last context information vector of a sentence represents that sentence's sentence vector; S300, feeding the sentence vectors of the sentence pair P and Q into a first-level selection gate to obtain core information features; S400, feeding the core information obtained in step S300 into a second-level selection gate to extract core information features again; S500, feeding the core information obtained in step S400 into a multi-angle semantic matching network, wherein the multi-angle semantic matching network comprises four strategies, namely full matching, maximum pooling matching, attention matching, and maximum attention matching, to obtain the feature matching vectors of the sentence pair; S600, passing the matching vectors obtained in step S500 through a second neural network to fuse the feature matching vectors into one fixed-length vector, which is fed into a prediction layer to compute the similarity probability distribution of the sentence pair.

2. The semantic similarity feature extraction method based on a double selection gate according to claim 1, characterized in that the first recurrent neural network is used to generate the state vectors of the context information.

3. The semantic similarity feature extraction method based on a double selection gate according to claim 1, characterized in that the first layer of the first recurrent neural network is a unidirectional long short-term memory network, the second layer is a bidirectional long short-term memory network, and each layer comprises a plurality of connected LSTM cell modules.

4. The semantic similarity feature extraction method based on a double selection gate according to claim 3, characterized in that the first recurrent neural network comprises two layers; the first layer of the first recurrent neural network is used to generate word-level vectors; and the second layer of the first recurrent neural network is used to generate context information vectors.

5. The semantic similarity feature extraction method based on a double selection gate according to claim 1, characterized in that the first-level selection gate and the second-level selection gate respectively comprise a plurality of first-level selection gate units and second-level selection gate units.

6. The semantic similarity feature extraction method based on a double selection gate according to claim 3, characterized in that in step S200, all word vectors of the sentence pair obtained in step S100 are fed into the first recurrent network in order to obtain the sentence state vector after each word is input, specifically: the i-th word vector and the output word vector at time i-1 are fed into the i-th LSTM cell module, which processes them to produce the sentence state vector after the i-th word vector.

7. The semantic similarity feature extraction method based on a double selection gate according to claim 5, characterized in that in step S300, feeding the sentence vectors of the sentence pair into the first-level selection gate to obtain core information features comprises: feeding the context information vector of sentence P at each time step and the i-th sentence vector of sentence Q into the first-level selection gate unit, and processing them by the i-th first-level selection gate unit to obtain the core information.

8. The semantic similarity feature extraction method based on a double selection gate according to any one of claims 1-7, characterized in that in step S400, feeding the core information obtained in step S300 into the second-level selection gate to extract core information features again comprises: feeding the core information produced by the i-th first-level selection gate unit into the i-th second-level selection gate unit, and processing it by the i-th second-level selection gate unit to obtain the core information features.

9. The semantic similarity feature extraction method based on a double selection gate according to any one of claims 1-8, characterized in that in step S500, feeding the core information obtained in step S400 into the multi-angle semantic matching network to obtain the feature matching vectors comprises: the full matching computes the cosine similarity between the context information vector of sentence P at each time step and the sentence vector of sentence Q to obtain a feature matching vector; the maximum pooling matching computes the cosine similarity between the context information vector of sentence P at each time step and the context information vector of sentence Q at each time step, and takes the maximum value as the feature matching vector; the attention matching computes the cosine between the context information vector of sentence P at time i and the context information vector of sentence Q at time i to obtain i cosine values for sentence P, takes the weighted cosine values as attention weights multiplied with the context information of sentence Q at each time step, and matches the result by cosine against the context information vector of sentence P at each time step to obtain the feature matching vector; the maximum attention matching computes the cosine between the context information vector of sentence P at time i and the context information vector of sentence Q at time i to obtain i cosine values for sentence P, selects the largest of the i cosine values as the attention weight multiplied with the context information of sentence Q, and matches the result by cosine against the context information vector of sentence P at each time step to obtain the feature matching vector.

10. The semantic similarity feature extraction method based on a double selection gate according to any one of claims 1-9, characterized in that in step S600, passing the matching vectors obtained in step S500 through the second neural network to fuse the feature matching vectors into one fixed-length vector fed into the prediction layer to compute the similarity probability distribution of the sentence pair comprises: aggregating the four feature matching vectors obtained from the four matchings of sentence P into one fixed-length feature matching vector through the second recurrent neural network; aggregating the four feature matching vectors obtained from the four matchings of sentence Q into one fixed-length feature matching vector through the bidirectional long short-term memory network; and feeding the two feature matching vectors of sentence P and sentence Q into the prediction layer to obtain the sentence-pair similarity.
CN201911032492.1A 2019-10-28 2019-10-28 A Semantic Similarity Feature Extraction Method Based on Double Selection Gate Pending CN110765755A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911032492.1A CN110765755A (en) 2019-10-28 2019-10-28 A Semantic Similarity Feature Extraction Method Based on Double Selection Gate

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911032492.1A CN110765755A (en) 2019-10-28 2019-10-28 A Semantic Similarity Feature Extraction Method Based on Double Selection Gate

Publications (1)

Publication Number Publication Date
CN110765755A true CN110765755A (en) 2020-02-07

Family

ID=69334325

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911032492.1A Pending CN110765755A (en) 2019-10-28 2019-10-28 A Semantic Similarity Feature Extraction Method Based on Double Selection Gate

Country Status (1)

Country Link
CN (1) CN110765755A (en)



Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106547885A (en) * 2016-10-27 2017-03-29 桂林电子科技大学 A kind of Text Classification System and method
CN109101494A (en) * 2018-08-10 2018-12-28 哈尔滨工业大学(威海) A method of it is calculated for Chinese sentence semantic similarity, equipment and computer readable storage medium
CN109214001A (en) * 2018-08-23 2019-01-15 桂林电子科技大学 A kind of semantic matching system of Chinese and method
CN109165300A (en) * 2018-08-31 2019-01-08 中国科学院自动化研究所 Text contains recognition methods and device
CN110162593A (en) * 2018-11-29 2019-08-23 腾讯科技(深圳)有限公司 A kind of processing of search result, similarity model training method and device
CN109800390A (en) * 2018-12-21 2019-05-24 北京石油化工学院 A kind of calculation method and device of individualized emotion abstract
CN110298037A (en) * 2019-06-13 2019-10-01 同济大学 The matched text recognition method of convolutional neural networks based on enhancing attention mechanism

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
QINGYU ZHOU et al., "Selective Encoding for Abstractive Sentence Summarization", arXiv:1704.07073v1, 24 April 2017, page 4 *
ZHIGUO WANG et al., "Bilateral Multi-Perspective Matching for Natural Language Sentences", arXiv:1702.03814v3, 14 July 2017, page 3 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339249A (en) * 2020-02-20 2020-06-26 齐鲁工业大学 A deep intelligent text matching method and device combining multi-angle features
CN111523241A (en) * 2020-04-28 2020-08-11 国网浙江省电力有限公司湖州供电公司 Construction method of a new type of electricity load logic information model
CN111523241B (en) * 2020-04-28 2023-06-13 国网浙江省电力有限公司湖州供电公司 Construction method of power load logic information model
CN111651973A (en) * 2020-06-03 2020-09-11 拾音智能科技有限公司 Text matching method based on syntax perception
CN111651973B (en) * 2020-06-03 2023-11-07 拾音智能科技有限公司 Text matching method based on syntactic perception
CN111523301A (en) * 2020-06-05 2020-08-11 泰康保险集团股份有限公司 Contract document compliance checking method and device
CN112434514B (en) * 2020-11-25 2022-06-21 重庆邮电大学 Semantic matching method, device and computer equipment based on multi-granularity and multi-channel neural network
CN112434514A (en) * 2020-11-25 2021-03-02 重庆邮电大学 Multi-granularity multi-channel neural network based semantic matching method and device and computer equipment
CN112560502B (en) * 2020-12-28 2022-05-13 桂林电子科技大学 A semantic similarity matching method, device and storage medium
CN112560502A (en) * 2020-12-28 2021-03-26 桂林电子科技大学 Semantic similarity matching method and device and storage medium
CN113157889A (en) * 2021-04-21 2021-07-23 韶鼎人工智能科技有限公司 Visual question-answering model construction method based on theme loss
CN113177406A (en) * 2021-04-23 2021-07-27 珠海格力电器股份有限公司 Text processing method and device, electronic equipment and computer readable medium
CN113177406B (en) * 2021-04-23 2023-07-07 珠海格力电器股份有限公司 Text processing method, text processing device, electronic equipment and computer readable medium

Similar Documents

Publication Publication Date Title
CN110765755A (en) A Semantic Similarity Feature Extraction Method Based on Double Selection Gate
CN110826337B (en) A Short Text Semantic Training Model Acquisition Method and Similarity Matching Algorithm
CN107992597B (en) A text structuring method for grid fault cases
CN109726389B (en) Chinese missing pronoun completion method based on common sense and reasoning
Zhang et al. Keywords extraction with deep neural network model
CN112100351A (en) A method and device for constructing an intelligent question answering system through question generation data sets
CN112541356B (en) Method and system for recognizing biomedical named entities
CN112232053B (en) Text similarity computing system, method and storage medium based on multi-keyword pair matching
CN110807084A (en) Attention mechanism-based patent term relationship extraction method for Bi-LSTM and keyword strategy
CN112818118B (en) Reverse translation-based Chinese humor classification model construction method
CN108874896B (en) Humor identification method based on neural network and humor characteristics
CN112163425A (en) Text entity relation extraction method based on multi-feature information enhancement
CN109783806B (en) Text matching method utilizing semantic parsing structure
CN114298055B (en) Retrieval method and device based on multilevel semantic matching, computer equipment and storage medium
CN111723572B (en) Chinese short text correlation measurement method based on CNN convolutional layer and BilSTM
CN113761890A (en) A Multi-level Semantic Information Retrieval Method Based on BERT Context Awareness
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN111400494A (en) A sentiment analysis method based on GCN-Attention
CN111581365A (en) Predicate extraction method
WO2023004528A1 (en) Distributed system-based parallel named entity recognition method and apparatus
CN111191464A (en) Semantic similarity calculation method based on combined distance
CN112307179A (en) Text matching method, apparatus, device and storage medium
CN114428850A (en) Text retrieval matching method and system
Sun et al. Text classification algorithm based on tf-idf and bert
CN111274359A (en) Query recommendation method and system based on improved VHRED and reinforcement learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20200207