CN110765755A - Semantic similarity feature extraction method based on double selection gates

Semantic similarity feature extraction method based on double selection gates

Info

Publication number
CN110765755A
Authority
CN
China
Prior art keywords
sentence
vector
matching
context information
ith
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911032492.1A
Other languages
Chinese (zh)
Inventor
蔡晓东 (Cai Xiaodong)
秦菲 (Qin Fei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology filed Critical Guilin University of Electronic Technology
Priority to CN201911032492.1A
Publication of CN110765755A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a semantic similarity feature extraction method based on double selection gates, which relates to the field of natural language processing. The method effectively alleviates the low matching efficiency caused by information redundancy, and at the same time avoids the cost of manually extracting core information.

Description

Semantic similarity feature extraction method based on double selection gates
Technical Field
The invention relates to the field of natural language processing, in particular to a semantic similarity feature extraction method based on a double selection gate.
Background
The world is full of massive amounts of information, most of it stored as text, and an important task of artificial intelligence is to organize this text into representations that a computer can understand as a human does. Because many words have multiple meanings, the same concept can be expressed in different ways, and other sources of uncertainty exist, traditional text similarity calculation methods based on string matching struggle to satisfy user needs in search engines, question-answering systems and the like. When a user enters keywords to search for matching information, much of the returned content may be irrelevant, with only a small portion matching the query, which causes great inconvenience to the user. Computing text similarity through deeper semantic understanding has therefore become a hotspot of current natural language research.
Many sentence semantic similarity matching methods exist in the prior art, and most initially focus on string matching. The basic flow generally has two steps: first, the two sentences whose similarity is to be judged are fed into a recurrent network and mapped into vector representations; then the cosine distance between the two sentence vectors is used to judge their similarity. Although judging the similarity of sentence pairs with the traditional string method helps people filter out some irrelevant information when searching for related questions, the quality of the search results is still unsatisfactory. Because string-based similarity only measures the distance between words at the word level, without contextual semantic information, the matches are error-prone and ambiguous, and the user ultimately cannot quickly find the information related to the keywords.
Therefore, it is necessary to invent a new semantic similarity feature extraction method.
Disclosure of Invention
The invention aims to provide a semantic similarity feature extraction method based on double selection gates, which can automatically judge the semantic similarity of two sentences, effectively reduce redundant sentence information through two rounds of automatic core-information selection, and improve the accuracy and efficiency of sentence similarity judgment.
The technical scheme is as follows:
S100, performing word segmentation on the sentence pair P and Q to be processed, and vectorizing the segmented words to obtain word vectors;
S200, inputting all word vectors of the sentence pair P and Q obtained in step S100 into a first recurrent neural network in sequence to obtain context information vectors, wherein the last context information vector of a sentence represents the sentence vector of that sentence;
S300, inputting the sentence vectors of the sentence pair P and Q into a primary selection gate to obtain core information features;
S400, inputting the core information obtained in step S300 into a secondary selection gate to acquire the core information features again;
S500, inputting the core information acquired in step S400 into a multi-angle semantic matching network, which comprises four matching modes (full matching, max-pooling matching, attention matching and max-attention matching), to obtain feature matching vectors of the sentence pair;
and S600, fusing the feature matching vectors obtained in step S500 into a fixed-length vector through a second neural network, and inputting the vector into a prediction layer to calculate the similarity probability distribution of the sentence pair.
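For orientation, the following is a minimal, runnable sketch of how steps S100 to S600 chain together. Every helper below is a trivial stand-in introduced for illustration only (hypothetical, not the patent's implementation); the real versions would be the Word2Vec embedding, the two-layer recurrent encoder, the double selection gates, the four matching modes and the BiLSTM aggregation described in the detailed description.

```python
import numpy as np

rng = np.random.default_rng(0)

def segment(text):       # S100: word segmentation (stand-in: whitespace split)
    return text.split()

def embed(words):        # S100: word vectors (stand-in: random 100-d vectors)
    return np.stack([rng.normal(size=100) for _ in words])

def encode(vecs):        # S200: context vectors (stand-in: running mean)
    return np.cumsum(vecs, axis=0) / np.arange(1, len(vecs) + 1)[:, None]

def select(H):           # S300/S400: one selection-gate pass (stand-in gating)
    return H / (1.0 + np.exp(-H))        # elementwise H * sigmoid(H)

def match(A, B):         # S500: one matching view (stand-in: max cosine per row)
    An = A / np.linalg.norm(A, axis=1, keepdims=True)
    Bn = B / np.linalg.norm(B, axis=1, keepdims=True)
    return (An @ Bn.T).max(axis=1)

def aggregate(m):        # S600: fuse to a fixed length (stand-in: fixed statistics)
    return np.array([m.mean(), m.max(), m.min()])

def predict(v_p, v_q):   # S600: two-class similarity probability distribution
    z = np.array([v_p @ v_q, -(v_p @ v_q)])
    e = np.exp(z - z.max())
    return e / e.sum()

H_p = select(select(encode(embed(segment("how do I buy a computer")))))
H_q = select(select(encode(embed(segment("where can I buy a PC")))))
print(predict(aggregate(match(H_p, H_q)), aggregate(match(H_q, H_p))))
```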
Preferably, the first recurrent neural network is configured to generate the state vectors of the context information.
Preferably, the first layer of the first recurrent neural network is a unidirectional long short-term memory (LSTM) network, the second layer is a bidirectional LSTM network, and each layer comprises a plurality of connected LSTM cell modules.
Preferably, the first recurrent neural network comprises two hierarchies;
the first layer of the first recurrent neural network is used to generate word-level vectors;
the second layer of the first recurrent neural network is used to generate context information vectors.
Preferably, the primary selection gate and the secondary selection gate comprise a plurality of primary selection gate units and secondary selection gate units respectively;
the primary selection gate and the secondary selection gate differ in structure and in parameters.
Preferably, in step S200, all word vectors of the sentence pair obtained in step S100 are sequentially input into the first recurrent network to obtain the sentence state vector after each word is input, specifically:
the ith word vector and the output of the (i-1)th moment are input into the ith LSTM cell module, which processes them to obtain the state vector of the sentence after the ith word vector is input.
Preferably, in step S300, inputting the sentence vectors of the sentence pair into the primary selection gate to acquire the core information features comprises:
inputting the context information vector at each moment of sentence P and the ith sentence vector of sentence Q into the primary selection gate unit, which processes them to obtain the core information.
Preferably, in step S400, inputting the core information obtained in step S300 into the secondary selection gate and acquiring the core information features again comprises:
inputting the core information processed by the ith primary selection gate unit into the ith secondary selection gate unit, which processes it to obtain the core information features.
Preferably, in step S500, inputting the core information acquired in step S400 into the multi-angle semantic matching network to obtain feature matching vectors comprises:
the full matching performs cosine similarity calculation between the context information vector at each moment of sentence P and the sentence vector of sentence Q to obtain a feature matching vector;
the max-pooling matching performs cosine similarity calculation between the context information vector at each moment of sentence P and the context information vector at each moment of sentence Q, and selects the maximum value as the feature matching vector;
the attention matching performs cosine calculation between the context information vector at the ith moment of sentence P and the context information vector at each moment of sentence Q, obtaining the cosine values of sentence P; these cosine values are taken as attention weights and multiplied with the context information at each moment of sentence Q, and the result is further cosine-matched with the context information vector at each moment of sentence P to obtain a feature matching vector;
the max-attention matching performs cosine calculation between the context information vector at the ith moment of sentence P and the context information vector at each moment of sentence Q, obtaining the cosine values of sentence P; the maximum of these values is taken as the attention weight and multiplied with the context information of sentence Q, and the result is cosine-matched with the context information vector at each moment of sentence P to obtain a feature matching vector.
Preferably, the second neural network comprises two bidirectional long short-term memory networks, which process the feature matching vectors of the sentence pair and aggregate them into a fixed-length vector.
Preferably, in step S600, passing the matching vectors obtained in step S500 through the second neural network to fuse the feature matching vectors into a fixed-length vector and inputting the vector into the prediction layer to calculate the similarity probability distribution of the sentence pair comprises:
aggregating the four feature matching vectors obtained from the four matchings of sentence P into one fixed-length feature matching vector through the second recurrent neural network;
aggregating the four feature matching vectors obtained from the four matchings of sentence Q into one fixed-length feature matching vector through the bidirectional long short-term memory network;
and inputting the two feature matching vectors of sentence P and sentence Q into the prediction layer to obtain the sentence-pair similarity.
Preferably, in step S100, Word2Vec is adopted to vectorize the words after Jieba word segmentation. Word2Vec is a predictive model that can efficiently learn word embeddings; its basic idea is to represent each word in natural language as a short dense vector of unified dimension.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
1. The semantic similarity feature extraction method based on double selection gates automatically acquires the core information in sentences without manually removing redundant information. The semantic similarity model can automatically judge the semantic similarity of two sentences with higher accuracy and efficiency, helping users find better-matched results in question-answering or search systems.
2. The semantic similarity feature extraction method based on double selection gates uses a bidirectional long short-term memory network to produce the contextual vector representation of each sentence. The network's cell state can capture long-distance dependencies in text and remember long-term state, realizing the updating, forgetting and filtering of information; it expresses context better and alleviates the vanishing- and exploding-gradient problems. A conventional RNN connects the past output and the current input and controls the output through an activation function, so it can only take the state at the most recent time into account.
3. According to the semantic similarity feature extraction method based on the double selection gates, the core semantic information in the sentence is automatically acquired by utilizing the two selection gates, so that the influence of redundant information on the judgment of the semantic similarity of the sentence is avoided, and the matching efficiency is improved.
4. The semantic similarity feature extraction method based on double selection gates uses the multi-angle semantic matching network to match two sentences in four modes: full matching, max-pooling matching, attention matching and max-attention matching. These four modes make full use of the context information vectors for finer, multi-angle matching, which effectively avoids the low accuracy of judging similarity only by the cosine distance between the words of two sentences, as in traditional methods. A bidirectional long short-term memory network is adopted to fuse the matching vectors into fixed-length vectors, which effectively controls the dimensionality of the matching vectors and facilitates the prediction layer's calculation of sentence-pair similarity.
5. The semantic similarity feature extraction method based on the double selection gates can effectively improve the judgment accuracy and efficiency of the semantic similarity of the sentences, and is suitable for Chinese and English sentence pair corpora.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present invention.
FIG. 2 is a block diagram of a dual select gate module according to an embodiment of the present invention.
FIG. 3 is a diagram of a multi-angle semantic matching network structure according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. Of course, the specific embodiments described herein are merely illustrative of the invention and are not intended to be limiting.
It should be noted that the embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
Example 1
Referring to fig. 1, the present invention provides a semantic similarity feature extraction method based on double selection gates, including:
S100, performing word segmentation on the sentence pair P and Q to be processed, and vectorizing the segmented words to obtain word vectors.
The word segmentation in step S100 is the process of cutting a sentence into a reasonable word sequence that fits the contextual meaning. It is one of the key technologies and difficulties in natural language understanding and text information processing, and an important processing link in the semantic similarity model. Chinese word segmentation is complex because there are no explicit boundaries between words, and words are used flexibly, vary widely, are semantically rich and easily become ambiguous. According to research, the main difficulties of statistics-based Chinese word segmentation are ambiguity resolution, proper nouns and new-word discovery. The invention adopts Jieba to segment Chinese text and Nltk to segment English text, thereby improving segmentation accuracy.
Models for vectorizing words include the One-hot model and the Distributed model. The One-hot model is simple, but its dimensionality cannot be controlled and it does not represent the relations between words well, so the method adopts the Distributed model, specifically Word2Vec, to vectorize the words.
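A minimal sketch of this preprocessing step, assuming the Jieba and Gensim libraries; the toy corpus and the training settings (vector size, window, skip-gram) are illustrative choices, not values fixed by the invention:

```python
import jieba
from gensim.models import Word2Vec

corpus = ["我想买一台电脑", "我想买一台计算机"]      # toy sentence pair
tokenized = [jieba.lcut(s) for s in corpus]         # Chinese word segmentation
# Train a small skip-gram Word2Vec model on the segmented corpus.
model = Word2Vec(sentences=tokenized, vector_size=100,
                 window=5, min_count=1, sg=1)
# Look up the distributed vector of each segmented word.
word_vectors = [[model.wv[w] for w in sent] for sent in tokenized]
```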
S200, all word vectors of the sentence pairs P and Q obtained in the step S100 are input into a first recurrent neural network in sequence to obtain context information vectors, wherein the last context information vector of the sentence represents a sentence vector of the sentence;
the first recurrent neural network is used for generating a state vector of the context information; the first recurrent neural network comprises two hierarchical structures, wherein the first layer is a single long-term and short-term memory network and is used for generating word-level vectors; the second layer is a bidirectional long-time and short-time memory network and is used for generating context information vectors; each hierarchy comprising a plurality of linked LSTM cell modules; the module parameters at different hierarchies are different in order to generate the word level and context information vectors.
All word vectors of the sentence pair obtained in step S100 are input into the first recurrent network in sequence, obtaining the sentence state vector after each word is input, specifically:
the ith word vector and the output of the (i-1)th moment are input into the ith LSTM cell module, which processes them to obtain the state vector of the sentence after the ith word vector is input.
S300, inputting sentence vectors of the sentence pairs P and Q into a primary selection gate to obtain core information characteristics;
specifically, a context information vector at each moment of the sentence P and an ith sentence vector of the sentence Q are input into a first-level selection gate unit, and the core information is obtained through the processing of the ith first-level selection gate unit.
S400, inputting the core information obtained in the step S300 into a secondary selection gate, and acquiring the core information characteristics again; specifically, the core information obtained by processing of the ith primary selection gate unit is input into the ith secondary selection gate unit, and the core information characteristics are obtained by processing of the ith secondary selection gate unit.
The primary selection gate and the secondary selection gate comprise a plurality of primary selection gate units and secondary selection gate units respectively;
the primary selection gate and the secondary selection gate differ in structure and in parameters.
S500, inputting the core information acquired in the step S400 into a multi-angle semantic matching network, wherein the multi-angle semantic matching network comprises four modes of full matching, maximum pooling matching, attention matching and maximum attention matching to obtain feature matching vectors of sentence pairs; in particular to a method for preparing a high-performance nano-silver alloy,
the full matching performs cosine similarity calculation between the context information vector at each moment of sentence P and the sentence vector of sentence Q to obtain a feature matching vector;
the max-pooling matching performs cosine similarity calculation between the context information vector at each moment of sentence P and the context information vector at each moment of sentence Q, and selects the maximum value as the feature matching vector;
the attention matching performs cosine calculation between the context information vector at the ith moment of sentence P and the context information vector at each moment of sentence Q, obtaining the cosine values of sentence P; these cosine values are taken as attention weights and multiplied with the context information at each moment of sentence Q, and the result is further cosine-matched with the context information vector at each moment of sentence P to obtain a feature matching vector;
the max-attention matching performs cosine calculation between the context information vector at the ith moment of sentence P and the context information vector at each moment of sentence Q, obtaining the cosine values of sentence P; the maximum of these values is taken as the attention weight and multiplied with the context information of sentence Q, and the result is cosine-matched with the context information vector at each moment of sentence P to obtain a feature matching vector.
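The four matching modes can be sketched in NumPy as below, following the plain-cosine description above; the trainable per-perspective weight matrices of a full multi-perspective matching model are omitted, and the array shapes are arbitrary illustrative choices:

```python
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def full_matching(H_p, q_sent):
    """Cosine between each moment of P and the sentence vector of Q."""
    return np.array([cos(h, q_sent) for h in H_p])

def maxpool_matching(H_p, H_q):
    """Cosine between every P moment and every Q moment; keep the maximum."""
    return np.array([max(cos(h_p, h_q) for h_q in H_q) for h_p in H_p])

def attentive_matching(H_p, H_q):
    """Cosine weights over Q, weighted sum of Q, then cosine with each P moment."""
    out = []
    for h_p in H_p:
        w = np.array([cos(h_p, h_q) for h_q in H_q])         # attention weights
        attended = (w[:, None] * H_q).sum(axis=0) / (w.sum() + 1e-8)
        out.append(cos(h_p, attended))
    return np.array(out)

def max_attentive_matching(H_p, H_q):
    """Keep only the Q moment with the largest cosine weight, then match with P."""
    return np.array([cos(h_p, H_q[int(np.argmax([cos(h_p, h_q) for h_q in H_q]))])
                     for h_p in H_p])

rng = np.random.default_rng(3)
H_p, H_q = rng.normal(size=(5, 8)), rng.normal(size=(7, 8))  # toy context vectors
match_p = np.stack([full_matching(H_p, H_q[-1]),             # H_q[-1]: sentence vector
                    maxpool_matching(H_p, H_q),
                    attentive_matching(H_p, H_q),
                    max_attentive_matching(H_p, H_q)])       # shape (4, len(P))
```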
The second neural network comprises two bidirectional long short-term memory networks, which process the feature matching vectors of the sentence pair and aggregate them into a fixed-length vector.
S600, the matching vector obtained in the step S500 is passed through a second neural network, so that the feature matching vector is fused into a vector with a fixed length, and the vector is input into a prediction layer to calculate the similarity probability distribution of sentence pairs, specifically,
aggregating four feature matching vectors obtained by four matching of the sentence P into a feature matching vector with a fixed length through a second recurrent neural network;
aggregating four feature matching vectors obtained by four matching of the sentence Q and a passing bidirectional long-and-short time memory network into a feature matching vector with a fixed length;
and inputting the two feature matching vectors of the sentence P and the sentence Q into a prediction layer to obtain the sentence pair similarity.
In step S100, Word2Vec is used to vectorize the words after Jieba word segmentation.
Example 2
On the basis of embodiment 1, the first recurrent neural network consists of one layer of unidirectional LSTM and one layer of bidirectional LSTM; each layer comprises a plurality of connected LSTM cell modules, and the current input and the output of the previous moment are processed by the input gate, forget gate, update gate and filtering output gate inside each LSTM cell module. The first layer of the first recurrent neural network comprises a plurality of connected unidirectional LSTM cell modules and derives a state vector for each word. The second layer comprises a plurality of connected bidirectional LSTM cell modules and produces the context information vectors of the sentence.
In the method, the words and context information of a sentence are first modeled by the first recurrent neural network, obtaining the state vector of each word at the corresponding moment and the context information vector of the sentence at each moment. As shown in fig. 2, step S200 uses a Long Short-Term Memory network (LSTM) in the first recurrent neural network, computed as follows:
f_t = σ(W_f w_t + U_f h_{t-1} + b_f);
i_t = σ(W_i w_t + U_i h_{t-1} + b_i);
o_t = σ(W_o w_t + U_o h_{t-1} + b_o);
c̃_t = tanh(W_c w_t + U_c h_{t-1} + b_c);
c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t;
h_t = o_t ⊙ tanh(c_t);
In the above formulas, f_t is the output of the forget gate; i_t is the output of the input gate; o_t is the output of the output gate; W_f, W_i, W_o, W_c and b_f, b_i, b_o, b_c are the weight matrices and bias vectors of the forget gate, input gate, output gate and candidate (selection) gate respectively; c̃_t is the new candidate memory; c_t is the updated memory content of the LSTM cell; σ is the sigmoid function; ⊙ is the element-wise product; h_{t-1} is the hidden-layer output at time t-1; and w_t is the input at time t.
In the method of the invention, because the context of the sentence is modeled by the recurrent neural network, the state vector of the sentence after the word input at time t theoretically contains the information of all the preceding words; that is, the sentence state vector h_n obtained after the last word is input contains all the information of the whole sentence. Therefore h_n represents the state vector of the entire sentence, i.e., the sentence vector.
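A minimal NumPy sketch of this recurrence; the dimensions and random initialization are illustrative assumptions, and the last hidden state is kept as the sentence vector h_n, as described above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

d_in, d_h = 100, 128      # assumed word-vector and hidden sizes
rng = np.random.default_rng(0)
W = {g: rng.normal(0, 0.1, (d_h, d_in)) for g in "fioc"}  # W_f, W_i, W_o, W_c
U = {g: rng.normal(0, 0.1, (d_h, d_h)) for g in "fioc"}   # U_f, U_i, U_o, U_c
b = {g: np.zeros(d_h) for g in "fioc"}                    # b_f, b_i, b_o, b_c

def lstm_step(w_t, h_prev, c_prev):
    f = sigmoid(W["f"] @ w_t + U["f"] @ h_prev + b["f"])        # forget gate f_t
    i = sigmoid(W["i"] @ w_t + U["i"] @ h_prev + b["i"])        # input gate i_t
    o = sigmoid(W["o"] @ w_t + U["o"] @ h_prev + b["o"])        # output gate o_t
    c_tilde = np.tanh(W["c"] @ w_t + U["c"] @ h_prev + b["c"])  # candidate memory
    c = f * c_prev + i * c_tilde          # c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t
    h = o * np.tanh(c)                    # h_t = o_t ⊙ tanh(c_t)
    return h, c

# Run the cell over a sentence; the last hidden state h_n is the sentence vector.
words = [rng.normal(size=d_in) for _ in range(6)]   # stand-in word vectors
h = c = np.zeros(d_h)
for w_t in words:
    h, c = lstm_step(w_t, h, c)
sentence_vector = h                       # h_n in the text
```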
Example 3
On the basis of embodiment 1 or 2, the double selection gate comprises two selection-gate structures that differ in structure and in parameters. Passing through the different selection gates helps filter out redundant information in the sentences and acquire the core information more accurately. The first-layer selection gate is calculated as follows:
s = h_n;
sGate_i = σ(W_s h_i + U_s s + b);
h'_i = h_i ⊙ sGate_i;
In the above formulas, the sentence vector is built from the context hidden vectors of the sentence, taking the last hidden state h_n of the sentence as the sentence vector s; sGate_i is the gate vector, W_s and U_s are weight matrices, b is a bias vector, σ is the sigmoid activation function, ⊙ is the element-wise product, and h'_i is the gated context vector at moment i.
The second-layer selection gate calculates the context vector at time t, using the sentence vector of the previous moment and the hidden state h'_i of the selection gate to compute the selection-gate weights, which are finally normalized. The calculation formulas are as follows:
e_{i,j} = v_a^T tanh(W_a s_{t-1} + U_a h'_i);
a_{i,j} = exp(e_{i,j}) / Σ_i exp(e_{i,j});
c_k = Σ_i a_{i,j} h'_i;
where h'_i is the context hidden vector; v_a, W_a and U_a are weight parameters; a_{i,j} is the normalized selection-gate weight; and c_k is the core feature vector of the kth sentence, with k = 1, 2 being the index of the sentence in the pair.
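The two selection-gate layers can be sketched in NumPy as below; the shapes, the initialization, and the choice of the previous sentence vector s_{t-1} are illustrative assumptions consistent with the formulas above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d = 128                                     # assumed hidden size
rng = np.random.default_rng(1)
W_s, U_s = rng.normal(0, 0.1, (d, d)), rng.normal(0, 0.1, (d, d))
b_s = np.zeros(d)
W_a, U_a = rng.normal(0, 0.1, (d, d)), rng.normal(0, 0.1, (d, d))
v_a = rng.normal(0, 0.1, d)

def primary_gate(H):
    """First layer: gate each context vector h_i with the sentence vector s = h_n."""
    s = H[-1]                                       # s = h_n
    gate = sigmoid(H @ W_s.T + s @ U_s.T + b_s)     # sGate_i
    return H * gate                                 # h'_i = h_i ⊙ sGate_i

def secondary_gate(H_prime, s_prev):
    """Second layer: normalized reweighting of h'_i into a core feature vector."""
    e = np.tanh(s_prev @ W_a.T + H_prime @ U_a.T) @ v_a   # e_{i,j}
    a = softmax(e)                                  # normalized weights a_{i,j}
    return a @ H_prime                              # core feature vector c_k

H_p = rng.normal(size=(6, d))                       # stand-in context vectors of P
core_p = secondary_gate(primary_gate(H_p), s_prev=H_p[-1])
```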
Referring to fig. 2, let P = [p_1, p_2, ..., p_i, ..., p_n] and Q = [q_1, q_2, ..., q_i, ..., q_m] denote the input sentence-pair sequences. The model inputs one word at a time, and step S200 produces the context information vector representation at each moment of each sentence, giving the context hidden-vector matrix of sentence P and the context-vector matrix of sentence Q. The core information is then obtained through the two selection-gate layers in steps S300 and S400, yielding the core feature representation of sentence P and, by analogy, that of sentence Q.
The method of the invention obtains the context information vectors of the sentences through the recurrent neural network, which strengthens the contextual semantic relevance between the two sentences and allows their semantic similarity to be judged better.
As shown in fig. 3, the second recurrent neural network is a bidirectional LSTM network comprising a plurality of connected bidirectional LSTM cell modules. In order to turn the feature matching vectors generated by the multi-angle matching network into fixed-length vectors for the prediction layer, the matching vectors are input into the bidirectional LSTM network and fused into fixed-length vectors.
To obtain the similarity judgment of the two sentences, the four feature matching vectors of sentence P against sentence Q are input into the second recurrent neural network and fused into one fixed-length vector, and the four feature matching vectors of sentence Q against sentence P are processed in the same way. The two resulting fixed-length matching vectors are then input into the prediction layer to obtain the similarity probability distribution of the sentence pair.
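A minimal NumPy sketch of this prediction step: the two fixed-length aggregated vectors are concatenated and mapped through a small feed-forward layer to a two-class probability distribution (similar / not similar); the layer sizes and the two-class formulation are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d_agg = 256                   # assumed size of each aggregated matching vector
rng = np.random.default_rng(2)
W1, b1 = rng.normal(0, 0.1, (128, 2 * d_agg)), np.zeros(128)
W2, b2 = rng.normal(0, 0.1, (2, 128)), np.zeros(2)

def predict(v_p, v_q):
    x = np.concatenate([v_p, v_q])    # fuse the two fixed-length sentence vectors
    h = np.tanh(W1 @ x + b1)          # hidden layer
    return softmax(W2 @ h + b2)       # similarity probability distribution

probs = predict(rng.normal(size=d_agg), rng.normal(size=d_agg))
```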
Besides using the context information between sentences, the method of the invention automatically extracts core information features from the sentences as the input of the matching network, which improves matching accuracy, reduces the matching network's processing of redundant information, and improves matching efficiency. For words with the same meaning but different surface forms, similarity can be judged through the model: for example, for two different words that both mean "computer", the judgment considers not only the distance between the words but also the context information of the sentences in which they appear.
The present invention is not limited to the above preferred embodiments, and any modifications, equivalent replacements, improvements, etc. within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A semantic similarity feature extraction method based on double selection gates, characterized by comprising the following steps:
S100, carrying out word segmentation on P and Q in sentences to be processed, and carrying out vectorization representation on words subjected to word segmentation to obtain word vectors;
s200, all word vectors of the sentence pairs P and Q obtained in the step S100 are input into a first recurrent neural network in sequence to obtain context information vectors, wherein the last context information vector of the sentence represents a sentence vector of the sentence;
s300, inputting sentence vectors of the sentence pairs P and Q into a primary selection gate to obtain core information characteristics;
s400, inputting the core information obtained in the step S300 into a secondary selection gate, and acquiring the core information characteristics again;
s500, inputting the core information acquired in the step S400 into a multi-angle semantic matching network, wherein the multi-angle semantic matching network comprises four modes of full matching, maximum pooling matching, attention matching and maximum attention matching to obtain feature matching vectors of sentence pairs;
and S600, fusing the feature matching vectors obtained in the step S500 into a vector with a fixed length through a second neural network, and inputting the vector into a prediction layer to calculate the similarity probability distribution of sentence pairs.
2. The semantic similarity feature extraction method based on double selection gates according to claim 1, wherein the first recurrent neural network is used for generating the state vectors of context information.
3. The semantic similarity feature extraction method based on double selection gates according to claim 1, wherein the first layer of the first recurrent neural network is a unidirectional long short-term memory network, the second layer is a bidirectional long short-term memory network, and each layer comprises a plurality of connected LSTM cell modules.
4. The semantic similarity feature extraction method based on double selection gates according to claim 3, wherein
the first recurrent neural network comprises two hierarchies;
a first layer of the first recurrent neural network is used to generate word-level vectors;
a second layer of the first recurrent neural network is used to generate a context information vector.
5. The semantic similarity feature extraction method based on double selection gates according to claim 1, wherein the primary selection gate and the secondary selection gate comprise a plurality of primary selection gate units and secondary selection gate units respectively.
6. The semantic similarity feature extraction method based on double selection gates according to claim 3, wherein
in step S200, all word vectors of the sentence pair obtained in step S100 are sequentially input into the first recurrent network to obtain the sentence state vector after each word is input, specifically:
the ith word vector and the output of the (i-1)th moment are input into the ith LSTM cell module, which processes them to obtain the state vector of the sentence after the ith word vector is input.
7. The semantic similarity feature extraction method based on double selection gates according to claim 5, wherein
in step S300, inputting the sentence vectors of the sentence pair into the primary selection gate to obtain the core information features comprises:
inputting the context information vector at each moment of sentence P and the ith sentence vector of sentence Q into the primary selection gate unit, which processes them to obtain the core information.
8. The semantic similarity feature extraction method based on double selection gates according to any one of claims 1-7, wherein
in step S400, inputting the core information obtained in step S300 into the secondary selection gate and acquiring the core information features again comprises:
inputting the core information processed by the ith primary selection gate unit into the ith secondary selection gate unit, which processes it to obtain the core information features.
9. The semantic similarity feature extraction method based on double selection gates according to any one of claims 1-8, wherein in step S500, inputting the core information acquired in step S400 into the multi-angle semantic matching network to obtain feature matching vectors comprises:
the full matching performs cosine similarity calculation between the context information vector at each moment of sentence P and the sentence vector of sentence Q to obtain a feature matching vector;
the max-pooling matching performs cosine similarity calculation between the context information vector at each moment of sentence P and the context information vector at each moment of sentence Q, and selects the maximum value as the feature matching vector;
the attention matching performs cosine calculation between the context information vector at the ith moment of sentence P and the context information vector at each moment of sentence Q, obtaining the cosine values of sentence P; these cosine values are taken as attention weights and multiplied with the context information at each moment of sentence Q, and the result is further cosine-matched with the context information vector at each moment of sentence P to obtain a feature matching vector;
the max-attention matching performs cosine calculation between the context information vector at the ith moment of sentence P and the context information vector at each moment of sentence Q, obtaining the cosine values of sentence P; the maximum of these values is taken as the attention weight and multiplied with the context information of sentence Q, and the result is cosine-matched with the context information vector at each moment of sentence P to obtain a feature matching vector.
10. The semantic similarity feature extraction method based on double selection gates according to any one of claims 1-9, wherein in step S600, passing the matching vectors obtained in step S500 through the second neural network to fuse the feature matching vectors into a fixed-length vector and inputting the vector into the prediction layer to calculate the similarity probability distribution of the sentence pair comprises:
aggregating the four feature matching vectors obtained from the four matchings of sentence P into one fixed-length feature matching vector through the second recurrent neural network;
aggregating the four feature matching vectors obtained from the four matchings of sentence Q into one fixed-length feature matching vector through the bidirectional long short-term memory network;
and inputting the two feature matching vectors of sentence P and sentence Q into the prediction layer to obtain the sentence-pair similarity.
CN201911032492.1A 2019-10-28 2019-10-28 Semantic similarity feature extraction method based on double selection gates Pending CN110765755A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911032492.1A CN110765755A (en) 2019-10-28 2019-10-28 Semantic similarity feature extraction method based on double selection gates

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911032492.1A CN110765755A (en) 2019-10-28 2019-10-28 Semantic similarity feature extraction method based on double selection gates

Publications (1)

Publication Number Publication Date
CN110765755A true CN110765755A (en) 2020-02-07

Family

ID=69334325

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911032492.1A Pending CN110765755A (en) 2019-10-28 2019-10-28 Semantic similarity feature extraction method based on double selection gates

Country Status (1)

Country Link
CN (1) CN110765755A (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106547885A (en) * 2016-10-27 2017-03-29 桂林电子科技大学 A kind of Text Classification System and method
CN109101494A (en) * 2018-08-10 2018-12-28 哈尔滨工业大学(威海) A method of it is calculated for Chinese sentence semantic similarity, equipment and computer readable storage medium
CN109214001A (en) * 2018-08-23 2019-01-15 桂林电子科技大学 A kind of semantic matching system of Chinese and method
CN109165300A (en) * 2018-08-31 2019-01-08 中国科学院自动化研究所 Text contains recognition methods and device
CN110162593A (en) * 2018-11-29 2019-08-23 腾讯科技(深圳)有限公司 A kind of processing of search result, similarity model training method and device
CN109800390A (en) * 2018-12-21 2019-05-24 北京石油化工学院 A kind of calculation method and device of individualized emotion abstract
CN110298037A (en) * 2019-06-13 2019-10-01 同济大学 The matched text recognition method of convolutional neural networks based on enhancing attention mechanism

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
QINGYU ZHOU et al.: "Selective Encoding for Abstractive Sentence Summarization", arXiv:1704.07073v1, 24 April 2017, page 4 *
ZHIGUO WANG et al.: "Bilateral Multi-Perspective Matching for Natural Language Sentences", arXiv:1702.03814v3, 14 July 2017, page 3 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339249A (en) * 2020-02-20 2020-06-26 齐鲁工业大学 Deep intelligent text matching method and device combining multi-angle features
CN111523241A (en) * 2020-04-28 2020-08-11 国网浙江省电力有限公司湖州供电公司 Method for constructing novel power load logic information model
CN111523241B (en) * 2020-04-28 2023-06-13 国网浙江省电力有限公司湖州供电公司 Construction method of power load logic information model
CN111651973A (en) * 2020-06-03 2020-09-11 拾音智能科技有限公司 Text matching method based on syntax perception
CN111651973B (en) * 2020-06-03 2023-11-07 拾音智能科技有限公司 Text matching method based on syntactic perception
CN111523301A (en) * 2020-06-05 2020-08-11 泰康保险集团股份有限公司 Contract document compliance checking method and device
CN112434514B (en) * 2020-11-25 2022-06-21 重庆邮电大学 Multi-granularity multi-channel neural network based semantic matching method and device and computer equipment
CN112434514A (en) * 2020-11-25 2021-03-02 重庆邮电大学 Multi-granularity multi-channel neural network based semantic matching method and device and computer equipment
CN112560502B (en) * 2020-12-28 2022-05-13 桂林电子科技大学 Semantic similarity matching method and device and storage medium
CN112560502A (en) * 2020-12-28 2021-03-26 桂林电子科技大学 Semantic similarity matching method and device and storage medium
CN113157889A (en) * 2021-04-21 2021-07-23 韶鼎人工智能科技有限公司 Visual question-answering model construction method based on theme loss
CN113177406A (en) * 2021-04-23 2021-07-27 珠海格力电器股份有限公司 Text processing method and device, electronic equipment and computer readable medium
CN113177406B (en) * 2021-04-23 2023-07-07 珠海格力电器股份有限公司 Text processing method, text processing device, electronic equipment and computer readable medium

Similar Documents

Publication Publication Date Title
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
CN110765755A (en) Semantic similarity feature extraction method based on double selection gates
CN110210037B (en) Syndrome-oriented medical field category detection method
CN107291693B (en) Semantic calculation method for improved word vector model
CN111027595B (en) Double-stage semantic word vector generation method
CN112347268A (en) Text-enhanced knowledge graph joint representation learning method and device
WO2019080863A1 (en) Text sentiment classification method, storage medium and computer
CN110969020A (en) CNN and attention mechanism-based Chinese named entity identification method, system and medium
CN110222163A (en) A kind of intelligent answer method and system merging CNN and two-way LSTM
CN104834747A (en) Short text classification method based on convolution neutral network
CN113255320A (en) Entity relation extraction method and device based on syntax tree and graph attention machine mechanism
CN112163425A (en) Text entity relation extraction method based on multi-feature information enhancement
CN112232053B (en) Text similarity computing system, method and storage medium based on multi-keyword pair matching
CN110532395B (en) Semantic embedding-based word vector improvement model establishing method
CN111581364B (en) Chinese intelligent question-answer short text similarity calculation method oriented to medical field
CN114428850B (en) Text retrieval matching method and system
CN112417170B (en) Relationship linking method for incomplete knowledge graph
CN111639165A (en) Intelligent question-answer optimization method based on natural language processing and deep learning
CN114254645A (en) Artificial intelligence auxiliary writing system
CN114691864A (en) Text classification model training method and device and text classification method and device
CN115510230A (en) Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism
Li et al. Multimodal fusion with co-attention mechanism
Yang et al. Text classification based on convolutional neural network and attention model
CN114757184A (en) Method and system for realizing knowledge question answering in aviation field
CN114282592A (en) Deep learning-based industry text matching model method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200207