CN112560502B - Semantic similarity matching method and device and storage medium - Google Patents
- Publication number
- CN112560502B (application CN202011579327.0A)
- Authority
- CN
- China
- Prior art keywords
- vector matrix
- fusion
- vector
- splicing
- vectors
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention provides a semantic similarity matching method, device and storage medium, wherein the method comprises the following steps: importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the two samples to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed; and constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix. The method solves the problems of feature loss, insufficient inter-sentence interaction and vanishing network gradients; it enriches the semantic features of sentences, makes the information interaction between sentences more accurate and rich, and can capture semantic information of more sentence pairs.
Description
Technical Field
The invention mainly relates to the technical field of language processing, and in particular to a semantic similarity matching method, a semantic similarity matching device and a storage medium.
Background
Text matching is an important research area in natural language processing. In a text matching task, a model takes two text sequences as input and predicts the semantic relationship between them. Text matching underlies a wide range of tasks: natural language inference (judging whether a hypothesis can be deduced from a premise), paraphrase identification (determining whether two sentences express the same meaning), answer selection and the like can all be regarded as specific forms of the text similarity matching problem. The core problem of text matching is modeling the correlation between two sentences.
The most popular approach to text matching today is the deep neural network, and neural semantic similarity matching models have received wide attention for their strong ability to learn sentence representations. Sentence matching currently has two main frameworks: sentence-encoding-based frameworks and attention-based frameworks. In the first class, a simple matching model is built from two sentence semantic vectors, but matching sentence vectors directly omits the semantic feature interaction between the two sentences. In the second class, an attention mechanism models word-level interaction between the two sentences, fusing information features between them and achieving high accuracy. Deep network models also outperform shallow ones, which shows that deeper networks can learn more semantic features; however, gradients tend to vanish as networks deepen, so Yang et al. proposed the RE2 model, which uses residual network connections to deepen the network, effectively alleviating vanishing gradients and improving model performance. Although RE2 deepens the network with residual connections, its residual connections use summation, so the output features of each residual block are not tightly connected with the original features, which easily causes feature loss. Moreover, for inter-sentence information interaction, introducing an attention mechanism for word-level interaction means the model learns only the similarity semantic features between sentence pairs and does not capture further semantic information of the pair, such as difference semantic features and key semantic features.
Disclosure of Invention
The invention aims to solve the technical problem of the prior art and provides a semantic similarity matching method, a semantic similarity matching device and a semantic similarity matching storage medium.
The technical scheme for solving the technical problems is as follows: a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
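A minimal, hedged sketch of the four steps above in Python (the helper names `word_vectorize`, `fuse`, `pool` and `predict` are illustrative stand-ins, not the patent's components):

```python
import numpy as np

def word_vectorize(tokens, table, dim=4, rng=np.random.default_rng(0)):
    """Look up (or lazily initialize) one embedding per token -> word vector matrix."""
    return np.stack([table.setdefault(t, rng.standard_normal(dim)) for t in tokens])

def match(tokens_a, tokens_b, table, fuse, pool, predict):
    qa = word_vectorize(tokens_a, table)   # first word vector matrix
    qb = word_vectorize(tokens_b, table)   # second word vector matrix
    fa, fb = fuse(qa, qb)                  # first/second fusion vector matrices
    va, vb = pool(fa), pool(fb)            # first/second conversion vectors
    return predict(va, vb)                 # predicted value, used as matching result
```

Any concrete fusion network, pooling layer and prediction head can be plugged in for `fuse`, `pool` and `predict`; the structure only mirrors the claimed data flow.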
The invention has the beneficial effects that: the first word vector matrix and the second word vector matrix are obtained through word vectorization processing of the first sample to be analyzed and the second sample to be analyzed respectively, and the first fusion vector matrix and the second fusion vector matrix are obtained through feature fusion processing of the two word vector matrices by the training network. This solves the problems of feature loss, insufficient inter-sentence interaction and vanishing network gradients, enriches the semantic features of sentences, makes the information interaction between sentences more accurate and rich, and captures semantic information of more sentence pairs.
On the basis of the technical scheme, the invention can be further improved as follows.
Further, the training network comprises a plurality of densely connected networks, and the densely connected networks are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
and respectively inputting the first word vector matrix and the second word vector matrix into a first dense connection network for feature fusion processing, taking a result output by the first dense connection network which is sequentially connected as the input of a next dense connection network, and performing feature fusion processing until the last dense connection network which is sequentially connected outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix.
The beneficial effect of adopting the further scheme is that: the first and second fusion vector matrixes are obtained by respectively performing feature fusion processing on the first and second word vector matrixes through a training network, so that semantic features of sentences are enriched, and the problems of feature loss, insufficient interaction among sentences and network gradient disappearance are solved.
Further, the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a convergence network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
and inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix.
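The internal flow of one dense connection block described above (encode, splice, align, fuse) can be sketched as follows; the `encode`, `align` and `fuse` callables are placeholders for the bidirectional long-short term memory network, the alignment network and the fusion network, and the concatenation along the feature axis mirrors the splicing steps:

```python
import numpy as np

def dense_block(x1, x2, encode, align, fuse):
    # encode stands in for the bidirectional LSTM encoder (assumption: any
    # sequence encoder with matching output width works for this sketch).
    c1, c2 = encode(x1), encode(x2)          # context semantic feature matrices
    s1 = np.concatenate([x1, c1], axis=-1)   # first splicing vector matrix
    s2 = np.concatenate([x2, c2], axis=-1)   # second splicing vector matrix
    m1, m2 = align(s1, s2)                   # first/second semantic matching matrices
    return fuse(s1, m1), fuse(s2, m2)        # first/second fusion vector matrices
```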
The beneficial effect of adopting the further scheme is that: through the feature fusion processing of respectively inputting the first word vector matrix and the second word vector matrix into the first dense connection network, information interaction between sentences is more accurate and rich, the problems of feature loss, insufficient sentence interaction and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
Further, the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors;
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
e_ij = F(q_i)^T F(p_j),

wherein F(·) is a single-layer feed-forward network, q_i is the i-th first splicing vector, p_j is the j-th second splicing vector, T denotes the transpose, and e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector;
calculating a plurality of first semantic matching vectors from the similarity vectors and the plurality of second splicing vectors through a second formula, wherein the second formula is as follows:

q'_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) p_j,

wherein e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector, p_j is the j-th second splicing vector, q'_i is the i-th first semantic matching vector, and l_p is the length of the second splicing vector matrix;
calculating a plurality of second semantic matching vectors from the similarity vectors and the plurality of first splicing vectors through a third formula, wherein the third formula is as follows:

p'_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) q_i,

wherein e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector, q_i is the i-th first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the j-th second semantic matching vector;
and obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors.
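A small sketch of this alignment step, assuming the usual softmax-normalized attention form consistent with the variables listed above (the single-layer feed-forward network F defaults to the identity here):

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def align(q, p, F=lambda x: x):
    """Attention-style alignment between splicing matrices q (l_q, d) and p (l_p, d)."""
    e = F(q) @ F(p).T                   # e[i, j] = F(q_i)^T F(p_j)
    q_prime = softmax(e, axis=1) @ p    # each q_i summarized over the l_p vectors of p
    p_prime = softmax(e, axis=0).T @ q  # each p_j summarized over the l_q vectors of q
    return q_prime, p_prime
```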
The beneficial effect of adopting the further scheme is that: the first semantic matching vector matrix and the second semantic matching vector matrix are obtained by respectively aligning the first splicing vector matrix and the second splicing vector matrix through an alignment network, so that information interaction between sentences is more accurate and rich, the problems of feature loss, insufficient sentence interaction and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
Further, the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
wherein, gi=sigmoid(q′i),
Wherein, q'iFor the first semantic matching vector, qiFor the ith first stitching vector, G, G1,G2、G3、G4And G5Are all single-layer feedforward networks, sigmoid is an activation function, gi∈(0,1)dThe, an is a dot product operation,which means that the elements are multiplied by each other,are all the ith first initial fusion vector, αiIs the ith first fusion vector, [;]splicing operation for vectors;
and obtaining a first fusion vector matrix according to the plurality of first fusion vectors.
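The full fourth formula is not reproduced in this text, so the sketch below only illustrates a plausible gated fusion of the same flavor: three comparison features (raw pair, difference, element-wise product) produced by single-layer projections standing in for G1, G2 and G3, combined through a sigmoid gate. The roles of G4 and G5 here are assumptions, not the patent's exact wiring:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6
# W["G1".."G3"] are single-layer projections standing in for the feed-forward
# networks G1-G3; Wg produces the sigmoid gate (an assumed simplification).
W = {k: rng.standard_normal((2 * d, d)) * 0.1 for k in ("G1", "G2", "G3")}
Wg = rng.standard_normal((d, d)) * 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuse(s, m):
    f1 = np.concatenate([s, m], axis=-1) @ W["G1"]       # raw pair features
    f2 = np.concatenate([s, s - m], axis=-1) @ W["G2"]   # difference features
    f3 = np.concatenate([s, s * m], axis=-1) @ W["G3"]   # similarity features
    g = sigmoid(m @ Wg)                                  # gate g_i in (0, 1)^d
    return g * (f1 + f2 + f3) + (1.0 - g) * s            # gated fusion vectors
```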
The beneficial effect of adopting the further scheme is that: the fourth formula is used for respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors to obtain the plurality of first fusion vectors, so that information interaction between sentences is more accurate and rich, the problems of insufficient interaction between characteristic loss and sentences and network gradient disappearance are solved, and more semantic information of sentence pairs can be captured.
Further, the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
wherein, g'i=sigmoid(p′i),
Wherein, p'iFor the second semantic matching vector, piIs the ith second stitching vector, G'1,G′2、G′3、G′4And G'5Are single-layer feedforward networks, and sigmoid is an activation function, g'i∈(0,1)dThe, an is a dot product operation,which means that the elements are multiplied by each other,are all the ith second initial fusionVector, betaiIs the ith second fusion vector, [;]splicing operation for vectors;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
The beneficial effect of adopting the further scheme is that: the fifth formula is used for respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors to obtain the plurality of second fusion vectors, so that information interaction between sentences is more accurate and rich, the problems of insufficient interaction between the characteristic loss and the sentences and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
Further, the process of performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a prediction value includes:
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix through a sixth formula to obtain a predicted value, wherein the sixth formula is as follows:

ŷ = H([v1; v2; v1 + v2; v1 − v2; |v1 − v2|; v1 ∘ v2]),

wherein H is a multi-layer feed-forward neural network, v1 is the first conversion vector matrix, v2 is the second conversion vector matrix, ŷ is the predicted value, the +, −, |·| and ∘ operations are all performed element-wise, and [;] is the vector splicing operation.
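A hedged sketch of this prediction layer, building the spliced comparison features from the element-wise operations listed (sum, difference, absolute difference, product) and delegating the final score to a caller-supplied network H:

```python
import numpy as np

def predict(v1, v2, H):
    """Splice v1, v2 and their element-wise comparisons, then score with H.

    H stands in for the multi-layer feed-forward network; any callable that
    maps the spliced feature vector to a score works in this sketch."""
    feats = np.concatenate([v1, v2, v1 + v2, v1 - v2, np.abs(v1 - v2), v1 * v2])
    return H(feats)
```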
The beneficial effect of adopting the further scheme is that: the predicted value is obtained through the sixth formula for the relevance prediction processing of the first conversion vector matrix and the second conversion vector matrix, the problems of feature loss, insufficient sentence interaction and disappearance of network gradient are solved, the semantic features of sentences are enriched, the information interaction between sentences is more accurate and abundant, and the semantic information of more sentence pairs can be captured.
Another technical solution of the present invention for solving the above technical problems is as follows: a semantic similarity matching apparatus comprising:
the word vectorization module is used for importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
the feature fusion processing module is used for constructing a training network, and performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network respectively to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the vector conversion module is used for respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and the matching result obtaining module is used for carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
Another technical solution of the present invention for solving the above technical problems is as follows: a semantic similarity matching apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor when executing the computer program implementing the semantic similarity matching method as described above.
Another technical solution of the present invention for solving the above technical problems is as follows: a computer-readable storage medium, storing a computer program which, when executed by a processor, implements the semantic similarity matching method as described above.
Drawings
Fig. 1 is a schematic flow chart of a semantic similarity matching method according to an embodiment of the present invention;
fig. 2 is a block diagram of a semantic similarity matching apparatus according to an embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a schematic flow chart of a semantic similarity matching method according to an embodiment of the present invention.
Example 1:
as shown in fig. 1, a semantic similarity matching method includes the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
Specifically, a word-embedding algorithm is used to perform word vectorization processing on the first sample to be analyzed and the second sample to be analyzed respectively, so as to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed.
It should be understood that the first fused vector matrix and the second fused vector matrix are respectively subjected to vector conversion through the pooling layer, and a first conversion vector matrix corresponding to the first fused vector matrix and a second conversion vector matrix corresponding to the second fused vector matrix are obtained.
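A one-line illustration of such a pooling-based vector conversion; max-over-time pooling is an assumption here, since the text only states that a pooling layer performs the conversion:

```python
import numpy as np

def pool(fused):
    # Max-over-time pooling: collapse a (sentence_length, d) fusion vector
    # matrix into one fixed-size conversion vector, regardless of length.
    return fused.max(axis=0)
```

Because the output size no longer depends on sentence length, the two pooled vectors can be compared directly in the prediction step.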
In this embodiment, the first word vector matrix and the second word vector matrix are obtained by word-vectorizing the first sample to be analyzed and the second sample to be analyzed respectively, and the first fusion vector matrix and the second fusion vector matrix are obtained by feature-fusing the two word vector matrices through the training network. This solves the problems of feature loss, insufficient inter-sentence interaction and vanishing network gradients, enriches the semantic features of the sentences, makes the information interaction between sentences more accurate and rich, and captures semantic information of more sentence pairs.
Example 2:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
and respectively inputting the first word vector matrix and the second word vector matrix into a first dense connection network for feature fusion processing, taking a result output by the first dense connection network which is sequentially connected as the input of a next dense connection network, and performing feature fusion processing until the last dense connection network which is sequentially connected outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix.
It should be understood that connecting with residual connections, i.e. the dense connection network, ties the features from the lowest layer to the highest layer tightly together, so that the semantic features of the sentences are enriched.
It should be understood that the plurality of densely connected networks are connected in sequence, i.e. the output of the first densely connected network is connected to the input of the second, the output of the second is connected to the input of the third, and so on, until the output of the penultimate densely connected network is connected to the input of the last.
Specifically, the first dense connection network performs feature fusion processing on the first word vector matrix and the second word vector matrix respectively. The resulting first fusion vector matrix is then taken as the next first word vector matrix and the second fusion vector matrix as the next second word vector matrix, and the pair is input into the next dense connection network for feature fusion processing, and so on through all the dense connection networks, finally yielding the first fusion vector matrix corresponding to the first word vector matrix and the second fusion vector matrix corresponding to the second word vector matrix.
In particular, the present invention uses an enhanced residual network connection. Denote the input and output of the n-th residual network by x^(n) and o^(n) respectively, let o^(0) be a sequence of zero vectors, and let x^(1), the input of the first residual network, be the word vector matrix. For n ≥ 2, the input of the n-th residual network x^(n) is the splice of the input of the first residual network with the outputs of the previous two residual networks, as follows:

x^(n) = [x^(1); o^(n−1); o^(n−2)],

wherein [;] is the vector splicing operation.
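The splicing-based dense connection described above can be sketched as follows (assuming, for simplicity, that block outputs have the same width as the first input):

```python
import numpy as np

def block_input(x1, outputs):
    """Input of the next block: splice the first block's input x^(1) with the
    outputs of the two preceding blocks; o^(0) is taken as a zero sequence."""
    zero = np.zeros_like(x1)
    o_prev = outputs[-1] if len(outputs) >= 1 else zero
    o_prev2 = outputs[-2] if len(outputs) >= 2 else zero
    return np.concatenate([x1, o_prev, o_prev2], axis=-1)
```

Unlike summation-style residual connections, this splice keeps each block's output features intact alongside the original features, which is the property the text credits with avoiding feature loss.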
In the above embodiment, the first fused vector matrix and the second fused vector matrix are obtained by respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network, so that semantic features of sentences are enriched, and the problems of feature loss, insufficient interaction between sentences and disappearance of network gradients are solved.
Example 3:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into a first dense connection network respectively for feature fusion processing, taking a result output by the first dense connection network which is connected in sequence as the input of a next dense connection network, and performing feature fusion processing until a last dense connection network which is connected in sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
and inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix.
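The first two operations inside a dense connection block — encoding each word vector matrix and splicing it with the resulting contextual features — can be sketched as follows. The BiLSTM is replaced here by a random placeholder encoder (`mock_context_encoder`), an assumption for illustration; any encoder returning one contextual vector per token would slot in:

```python
import numpy as np

def mock_context_encoder(x, hidden=4, seed=0):
    """Placeholder for the bidirectional LSTM: returns one contextual
    feature vector (2*hidden dims, forward + backward) per token."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((x.shape[1], 2 * hidden)) * 0.1
    return np.tanh(x @ w)              # (seq_len, 2*hidden)

def splice(word_vecs, context_vecs):
    """Concatenate each token's word vector with its contextual feature
    vector, producing the splicing vector matrix fed to the alignment net."""
    return np.concatenate([word_vecs, context_vecs], axis=1)

words = np.ones((6, 50))               # 6 tokens, 50-dim word vectors
ctx = mock_context_encoder(words)      # (6, 8)
spliced = splice(words, ctx)
print(spliced.shape)                   # (6, 58)
```

The same two functions would be applied to both sentences, yielding the first and second splicing vector matrices.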
In particular, the alignment network is based on one of the first models to represent sentence pairs with an attention mechanism; compared with other models of its time it achieved state-of-the-art results on the SNLI dataset with roughly an order of magnitude fewer parameters, and it does not rely on word order information.
In this embodiment, the first word vector matrix and the second word vector matrix are respectively input into the first dense connection network for feature fusion processing, so that information interaction between the sentences is more accurate and richer, the problems of feature loss, insufficient interaction between sentences and vanishing network gradients are solved, and more semantic information of the sentence pair can be captured.
Example 4:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into a first dense connection network respectively for feature fusion processing, taking a result output by the first dense connection network which is sequentially connected as the input of a next dense connection network, and performing feature fusion processing until a last dense connection network which is sequentially connected outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
eij=F(qi)TF(pj),
wherein F () is a single layer feedforward network, qiFor the ith first stitching vector, pjIs the jth second stitching vector, T is the transpose, eijA similarity vector of the ith first splicing vector and the jth second splicing vector is obtained;
respectively calculating a plurality of similarity vectors and a plurality of first semantic matching vectors of the first splicing vector by a second formula to obtain a plurality of first semantic matching vectors:
wherein e isijFor the ith first spellingSimilarity vector, p, of the patch vector and the jth second patch vectorjIs the jth second stitching vector, q'iFor the first semantic matching vector, lpIs the second stitching vector matrix length;
respectively calculating a plurality of similarity vectors and a plurality of second semantic matching vectors of the first splicing vector by a third formula to obtain a plurality of second semantic matching vectors, wherein the third formula is as follows:
wherein e isijIs the similarity vector of the ith first stitching vector and the jth second stitching vector, qiFor the ith first stitching vector, lqIs the first concatenation vector matrix length, p'iMatching the vector for the second semantic;
and obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors.
It should be appreciated that the attention mechanism is employed to exchange information between the two sentences, i.e. to soft-align them.
Specifically, let the lengths of the first splicing vector matrix q and the second splicing vector matrix p be l_q and l_p respectively, so that the two splicing vector matrices can be written as q = (q_1, q_2, …, q_{l_q}) and p = (p_1, p_2, …, p_{l_p}). The similarity vector e_ij between q_i and p_j is computed as the dot product of the projected vectors:

e_ij = F(q_i)^T F(p_j),

wherein F is a single-layer feed-forward network. The semantic matching vector of each splicing vector is then computed through soft alignment of the sentence elements, with the formulas:

q′_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) p_j,

p′_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) q_i.
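The alignment step — a similarity matrix followed by softmax-weighted soft alignment in both directions — can be sketched as below. The projection F defaults to the identity here, whereas the method uses a single-layer feed-forward network:

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def soft_align(q, p, F=lambda x: x):
    """e_ij = F(q_i)^T F(p_j); q'_i attends over p, p'_j attends over q."""
    e = F(q) @ F(p).T                     # (l_q, l_p) similarity matrix
    q_aligned = softmax(e, axis=1) @ p    # q'_i: weighted sum of p rows
    p_aligned = softmax(e, axis=0).T @ q  # p'_j: weighted sum of q rows
    return q_aligned, p_aligned

q = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # 2 tokens
p = np.tile(np.array([[1.0, 2.0, 3.0]]), (4, 1))   # 4 identical tokens
qa, pa = soft_align(q, p)
print(qa.shape, pa.shape)   # (2, 3) (4, 3)
```

Each aligned row is a convex combination of the other sentence's rows, so when every row of p is identical, q's aligned vectors all equal that row.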
In the above embodiment, the alignment network aligns the first and second splicing vector matrices to obtain the first and second semantic matching vector matrices, so that information interaction between the sentences is more accurate and richer, the problems of feature loss, insufficient interaction between sentences and vanishing network gradients are solved, and more semantic information of the sentence pair can be captured.
Example 5:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into a first dense connection network respectively for feature fusion processing, taking a result output by the first dense connection network which is connected in sequence as the input of a next dense connection network, and performing feature fusion processing until a last dense connection network which is connected in sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
eij=F(qi)TF(pj),
wherein F () is a single layer feedforward network, qiIs the ith first splicing vector, pj is the jth second splicing vector, T is the transpose, eijA similarity vector of the ith first splicing vector and the jth second splicing vector is obtained;
respectively calculating a plurality of similarity vectors and a plurality of first semantic matching vectors of the first splicing vector by a second formula to obtain a plurality of first semantic matching vectors:
wherein e isijIs the similarity vector, p, of the ith first stitching vector and the jth second stitching vectorjIs the jth second stitching vector, q'iFor the first semantic matching vector, lpIs the second concatenation vector matrix length;
respectively calculating a plurality of similarity vectors and a plurality of second semantic matching vectors of the first splicing vectors by a third formula to obtain a plurality of second semantic matching vectors, wherein the third formula is as follows:
wherein e isijIs the similarity vector of the ith first splicing vector and the jth second splicing vector, qiFor the ith first stitching vector, lqIs the first concatenation vector matrix length, p'iMatching the vector for the second semantic;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
q̄_i^1 = G1([q_i; q′_i]),
q̄_i^2 = G2([q_i; q_i − q′_i]),
q̄_i^3 = G3([q_i; q_i ∘ q′_i]),
q̄_i^4 = G4([q_i; cos(q_i, q′_i)]),
g_i = sigmoid(G([q_i; q′_i])),
α_i = g_i ∘ G5([q̄_i^1; q̄_i^2; q̄_i^3; q̄_i^4]) + (1 − g_i) ∘ q_i,

wherein q′_i is the first semantic matching vector, q_i is the ith first splicing vector, G, G1, G2, G3, G4 and G5 are all single-layer feed-forward networks, sigmoid is an activation function, g_i ∈ (0,1)^d is a gate vector, ∘ denotes element-wise multiplication, q̄_i^1 to q̄_i^4 are the ith first initial fusion vectors, α_i is the ith first fusion vector, and [;] is the vector splicing operation;
and obtaining a first fusion vector matrix according to the plurality of first fusion vectors.
Specifically, the fusion layer compares the local and aligned representations from multiple perspectives. The fusion calculation formulas are:

q̄_i^1 = G1([q_i; q′_i]),
q̄_i^2 = G2([q_i; q_i − q′_i]),
q̄_i^3 = G3([q_i; q_i ∘ q′_i]),
q̄_i^4 = G4([q_i; cos(q_i, q′_i)]),
g_i = sigmoid(G([q_i; q′_i])),
α_i = g_i ∘ G5([q̄_i^1; q̄_i^2; q̄_i^3; q̄_i^4]) + (1 − g_i) ∘ q_i,

wherein G1, G2, G3, G4, G5 and G are single-layer feed-forward networks with independent parameters and ∘ denotes element-wise multiplication. The subtraction highlights the differences between the sentence pair, while the multiplication and cosine terms compare their similarity. sigmoid is the activation function, and g_i ∈ (0,1)^d is a computed selection gate that highlights the key features of the sentence.
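Because the fusion equations are only partially legible in this text, the following NumPy sketch is one plausible reading of the gated multi-perspective fusion (concatenation, difference, product and cosine views combined through a sigmoid gate). The layer sizes, ReLU activations and random weights are assumptions for illustration, not the patent's trained parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuse(q, q_aligned, seed=0):
    """Gated multi-perspective fusion of the local representation q and
    the aligned representation q' (rows are token vectors of width d)."""
    d = q.shape[1]
    rng = np.random.default_rng(seed)

    def ffn(d_in):
        # single-layer feed-forward network with ReLU (random weights)
        w = rng.standard_normal((d_in, d)) * 0.1
        return lambda x: np.maximum(x @ w, 0.0)

    G1, G2, G3 = ffn(2 * d), ffn(2 * d), ffn(2 * d)
    G4, G5 = ffn(d + 1), ffn(4 * d)
    w_gate = rng.standard_normal((2 * d, d)) * 0.1      # gate network G

    cos = (q * q_aligned).sum(1, keepdims=True) / (
        np.linalg.norm(q, axis=1, keepdims=True)
        * np.linalg.norm(q_aligned, axis=1, keepdims=True) + 1e-9)

    m1 = G1(np.concatenate([q, q_aligned], axis=1))      # concatenation view
    m2 = G2(np.concatenate([q, q - q_aligned], axis=1))  # difference view
    m3 = G3(np.concatenate([q, q * q_aligned], axis=1))  # product view
    m4 = G4(np.concatenate([q, cos], axis=1))            # cosine view
    g = sigmoid(np.concatenate([q, q_aligned], axis=1) @ w_gate)  # gate
    return g * G5(np.concatenate([m1, m2, m3, m4], axis=1)) + (1 - g) * q

alpha = fuse(np.ones((5, 8)), np.ones((5, 8)) * 0.5)
print(alpha.shape)   # (5, 8)
```

The gate interpolates between the combined multi-perspective features and the original splicing vector, so key token features can pass through unchanged.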
In this embodiment, the plurality of first fusion vectors are obtained by respectively performing fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors according to the fourth formula, so that information interaction between the sentences is more accurate and richer, the problems of feature loss, insufficient interaction between sentences and vanishing network gradients are solved, and more semantic information of the sentence pair can be captured.
Example 6:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into a first dense connection network respectively for feature fusion processing, taking a result output by the first dense connection network which is sequentially connected as the input of a next dense connection network, and performing feature fusion processing until a last dense connection network which is sequentially connected outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
eij=F(qi)TF(pj),
wherein F () is a single layer feedforward network, qiFor the ith first stitching vector, pjIs the jth second stitching vector, T is the transpose, eijA similarity vector of the ith first splicing vector and the jth second splicing vector is obtained;
respectively calculating a plurality of similarity vectors and a plurality of first semantic matching vectors of the first splicing vector by a second formula to obtain a plurality of first semantic matching vectors:
wherein e isijIs the similarity vector, p, of the ith first stitching vector and the jth second stitching vectorjIs the jth second stitching vector, q'iFor the first semantic matching vector, lpIs the second concatenation vector matrix length;
respectively calculating a plurality of similarity vectors and a plurality of second semantic matching vectors of the first splicing vectors by a third formula to obtain a plurality of second semantic matching vectors, wherein the third formula is as follows:
wherein e isijIs the similarity vector of the ith first splicing vector and the jth second splicing vector, qiFor the ith first stitching vector, lqIs the first concatenation vector matrix length, p'iMatching the vector for the second semantic;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
p̄_i^1 = G′1([p_i; p′_i]),
p̄_i^2 = G′2([p_i; p_i − p′_i]),
p̄_i^3 = G′3([p_i; p_i ∘ p′_i]),
p̄_i^4 = G′4([p_i; cos(p_i, p′_i)]),
g′_i = sigmoid(G′([p_i; p′_i])),
β_i = g′_i ∘ G′5([p̄_i^1; p̄_i^2; p̄_i^3; p̄_i^4]) + (1 − g′_i) ∘ p_i,

wherein p′_i is the second semantic matching vector, p_i is the ith second splicing vector, G′, G′1, G′2, G′3, G′4 and G′5 are all single-layer feed-forward networks, sigmoid is an activation function, g′_i ∈ (0,1)^d is a gate vector, ∘ denotes element-wise multiplication, p̄_i^1 to p̄_i^4 are the ith second initial fusion vectors, β_i is the ith second fusion vector, and [;] is the vector splicing operation;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
In this embodiment, the plurality of second fusion vectors are obtained by respectively performing fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors according to the fifth formula, so that information interaction between the sentences is more accurate and richer, the problems of feature loss, insufficient interaction between sentences and vanishing network gradients are solved, and more semantic information of the sentence pair can be captured.
Example 7:
a semantic similarity matching method is characterized by comprising the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the process of performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value comprises the following steps:
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix through a sixth expression to obtain a predicted value, wherein the sixth expression is as follows:
ŷ = H([v1; v2; v1 + v2; v1 − v2; |v1 − v2|; v1 ∘ v2]),

wherein H is a multi-layer feed-forward neural network, v1 is the first conversion vector matrix, v2 is the second conversion vector matrix, ŷ is the predicted value, the +, −, |·| and ∘ operations are all performed element by element, and [;] is the vector splicing operation.
In particular, the prediction layer takes as input the first conversion vector matrix v1 and the second conversion vector matrix v2, the two sequences output by the pooling layer, and obtains the predicted value according to the formula:

ŷ = H([v1; v2; v1 + v2; v1 − v2; |v1 − v2|; v1 ∘ v2]),

wherein H is a multi-layer feed-forward neural network. In a classification task, ŷ ∈ R^C denotes the unnormalized scores and C is the number of classes. In a regression task, ŷ is the predicted scalar value, i.e. the predicted value.
In the paraphrase recognition task, the prediction layer is:

ŷ = H([v1; v2; v1 ∘ v2; |v1 − v2|]).

A simplified version of the prediction layer network is also provided, with the formula:

ŷ = H([v1; v2]).

In the prediction layer, the +, −, |·| and ∘ operations are performed element by element to infer the relationship between the two sentences.
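The prediction layer's feature construction can be sketched as below. H is left as the identity, standing in for the multi-layer feed-forward network, and including all of the +, −, |·| and ∘ operations named above is an assumption about the exact feature set:

```python
import numpy as np

def predict(v1, v2, H=lambda z: z):
    """Build [v1; v2; v1+v2; v1-v2; |v1-v2|; v1*v2] (element-wise ops)
    and pass it through H, the classifier/regressor network."""
    feats = np.concatenate([v1, v2, v1 + v2, v1 - v2,
                            np.abs(v1 - v2), v1 * v2])
    return H(feats)

v1 = np.array([1.0, 2.0])    # pooled vector of sentence one
v2 = np.array([3.0, 4.0])    # pooled vector of sentence two
out = predict(v1, v2)        # [v1, v2, v1+v2, v1-v2, |v1-v2|, v1*v2] flattened
print(out.shape)             # (12,): six d-dimensional views concatenated
```

In practice H would map these 6d features to C unnormalized class scores (classification) or to a single scalar (regression).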
In this embodiment, the predicted value is obtained by performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix according to the sixth formula, so that the problems of feature loss, insufficient sentence interaction and vanishing network gradients are solved, the semantic features of the sentences are enriched, information interaction between the sentences is more accurate and richer, and more semantic information of the sentence pair can be captured.
Fig. 2 is a block diagram of a semantic similarity matching apparatus according to an embodiment of the present invention.
Example 8:
as shown in fig. 2, a semantic similarity matching apparatus includes:
the word vectorization module is used for importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
the feature fusion processing module is used for constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the vector conversion module is used for respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and the matching result obtaining module is used for carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
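The four modules above form a linear pipeline: word vectorization → feature fusion → vector conversion → relevance prediction. The sketch below shows only that wiring; every module body is a toy stand-in (the real networks are the BiLSTM, dense-connection, alignment and fusion structures described elsewhere in this document), and all helper names and transforms are illustrative assumptions:

```python
def word_vectorization(sample):
    """Toy stand-in: map each token to a small fixed vector (hash-based)."""
    return [[float((hash(tok) >> s) % 7) for s in (0, 3, 6)]
            for tok in sample.split()]

def feature_fusion(matrix):
    """Placeholder for the densely connected training network."""
    return [row + [sum(row)] for row in matrix]  # append a crude context feature

def vector_conversion(matrix):
    """Placeholder pooling: one fixed-size vector per sentence (max over rows)."""
    return [max(col) for col in zip(*matrix)]

def relevance_prediction(v1, v2):
    """Placeholder predictor: negative mean absolute difference (0 = identical)."""
    return -sum(abs(a - b) for a, b in zip(v1, v2)) / len(v1)

def match(sample1, sample2):
    m1, m2 = word_vectorization(sample1), word_vectorization(sample2)
    f1, f2 = feature_fusion(m1), feature_fusion(m2)
    v1, v2 = vector_conversion(f1), vector_conversion(f2)
    return relevance_prediction(v1, v2)  # higher = more similar

print(match("how old are you", "how old are you"))  # 0 for identical inputs
```

Each placeholder function corresponds one-to-one to a module of the apparatus, so swapping in trained networks changes the bodies but not the data flow.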
Example 9:
a semantic similarity matching apparatus comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the semantic similarity matching method described above when executing the computer program. The apparatus may be a computer or the like.
Example 10:
a computer-readable storage medium, storing a computer program which, when executed by a processor, implements the semantic similarity matching method as described above.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the part of the technical solution of the present invention that essentially contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (5)
1. A semantic similarity matching method is characterized by comprising the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing, and taking the result output by each dense connection network as the input of the next dense connection network in the sequence for further feature fusion processing, until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
eij=F(qi)TF(pj),
wherein F(·) is a single-layer feedforward network, q_i is the ith first splicing vector, p_j is the jth second splicing vector, T denotes the transpose, and e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector;
respectively calculating the plurality of similarity vectors and the plurality of second splicing vectors through a second formula to obtain a plurality of first semantic matching vectors, wherein the second formula is:
q'_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) · p_j,
wherein e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, p_j is the jth second splicing vector, q'_i is the ith first semantic matching vector, and l_p is the length of the second splicing vector matrix;
respectively calculating the plurality of similarity vectors and the plurality of first splicing vectors through a third formula to obtain a plurality of second semantic matching vectors, wherein the third formula is:
p'_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) · q_i,
wherein e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, q_i is the ith first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the jth second semantic matching vector;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
wherein g_i = sigmoid(q'_i),
wherein q'_i is the ith first semantic matching vector, q_i is the ith first splicing vector, G, G_1, G_2, G_3, G_4 and G_5 are all single-layer feedforward networks, sigmoid is the activation function, g_i ∈ (0,1)^d is the gating vector, · is the dot-product operation, ∘ denotes element-wise multiplication, the intermediate vectors are the ith first initial fusion vectors, α_i is the ith first fusion vector, and [;] is the vector splicing operation;
obtaining a first fusion vector matrix according to the plurality of first fusion vectors;
the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
wherein g'_i = sigmoid(p'_i),
wherein p'_i is the ith second semantic matching vector, p_i is the ith second splicing vector, G'_1, G'_2, G'_3, G'_4 and G'_5 are all single-layer feedforward networks, sigmoid is the activation function, g'_i ∈ (0,1)^d is the gating vector, · is the dot-product operation, ∘ denotes element-wise multiplication, the intermediate vectors are the ith second initial fusion vectors, β_i is the ith second fusion vector, and [;] is the vector splicing operation;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
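The alignment and fusion steps of claim 1 can be sketched end to end in pure Python. The formula images are not reproduced in this text, so the code follows the soft-alignment form implied by the surviving definitions (a softmax over the similarity scores e_ij in both directions) and uses a generic sigmoid-gated fusion in place of the patent's G_1…G_5 networks; the single-layer network F and the gate wiring are illustrative assumptions:

```python
import math
import random

random.seed(1)

def linear(x, W):
    """Single-layer feedforward network without bias: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def align(q, p, F):
    """e_ij = F(q_i)^T F(p_j); q'_i, p'_j are softmax-weighted sums."""
    e = [[dot(linear(qi, F), linear(pj, F)) for pj in p] for qi in q]
    q_prime = []
    for i in range(len(q)):                       # attend over p for each q_i
        w = softmax(e[i])
        q_prime.append([sum(w[j] * p[j][k] for j in range(len(p)))
                        for k in range(len(p[0]))])
    p_prime = []
    for j in range(len(p)):                       # attend over q for each p_j
        w = softmax([e[i][j] for i in range(len(q))])
        p_prime.append([sum(w[i] * q[i][k] for i in range(len(q)))
                        for k in range(len(q[0]))])
    return q_prime, p_prime

def fuse(x, x_aligned):
    """Generic sigmoid-gated fusion: g ∘ aligned + (1 - g) ∘ original."""
    fused = []
    for xi, ai in zip(x, x_aligned):
        g = [1.0 / (1.0 + math.exp(-(u + v))) for u, v in zip(xi, ai)]
        fused.append([gk * ak + (1 - gk) * uk
                      for gk, ak, uk in zip(g, ai, xi)])
    return fused

d = 3
F = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
q = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(4)]  # l_q = 4
p = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(5)]  # l_p = 5
q_prime, p_prime = align(q, p, F)
alpha = fuse(q, q_prime)   # first fusion vectors
beta = fuse(p, p_prime)    # second fusion vectors
print(len(q_prime), len(p_prime), len(alpha), len(beta))  # 4 5 4 5
```

Each aligned vector keeps the dimensionality of the opposite sentence's splicing vectors, and each fusion vector keeps its own sentence length, matching the shapes the claim language requires.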
2. The semantic similarity matching method according to claim 1, wherein the predicting the correlation between the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value comprises:
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix through a sixth expression to obtain a predicted value, wherein the sixth expression is as follows:
3. A semantic similarity matching apparatus, comprising:
the word vectorization module is used for importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
the feature fusion processing module is used for constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the vector conversion module is used for respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
a matching result obtaining module, configured to perform relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a prediction value, and use the prediction value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
in the feature fusion processing module, the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing, and taking the result output by each dense connection network as the input of the next dense connection network in the sequence for further feature fusion processing, until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
in the feature fusion processing module, the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
in the feature fusion processing module, the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix includes:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
eij=F(qi)TF(pj),
wherein F(·) is a single-layer feedforward network, q_i is the ith first splicing vector, p_j is the jth second splicing vector, T denotes the transpose, and e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector;
respectively calculating the plurality of similarity vectors and the plurality of second splicing vectors through a second formula to obtain a plurality of first semantic matching vectors, wherein the second formula is:
q'_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) · p_j,
wherein e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, p_j is the jth second splicing vector, q'_i is the ith first semantic matching vector, and l_p is the length of the second splicing vector matrix;
respectively calculating the plurality of similarity vectors and the plurality of first splicing vectors through a third formula to obtain a plurality of second semantic matching vectors, wherein the third formula is:
p'_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) · q_i,
wherein e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, q_i is the ith first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the jth second semantic matching vector;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
in the feature fusion processing module, the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
wherein g_i = sigmoid(q'_i),
wherein q'_i is the ith first semantic matching vector, q_i is the ith first splicing vector, G, G_1, G_2, G_3, G_4 and G_5 are all single-layer feedforward networks, sigmoid is the activation function, g_i ∈ (0,1)^d is the gating vector, · is the dot-product operation, ∘ denotes element-wise multiplication, the intermediate vectors are the ith first initial fusion vectors, α_i is the ith first fusion vector, and [;] is the vector splicing operation;
obtaining a first fusion vector matrix according to the plurality of first fusion vectors;
in the feature fusion processing module, the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
wherein g'_i = sigmoid(p'_i),
wherein p'_i is the ith second semantic matching vector, p_i is the ith second splicing vector, G'_1, G'_2, G'_3, G'_4 and G'_5 are all single-layer feedforward networks, sigmoid is the activation function, g'_i ∈ (0,1)^d is the gating vector, · is the dot-product operation, ∘ denotes element-wise multiplication, the intermediate vectors are the ith second initial fusion vectors, β_i is the ith second fusion vector, and [;] is the vector splicing operation;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
4. A semantic similarity matching apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that when the computer program is executed by the processor, the semantic similarity matching method according to any one of claims 1 to 2 is implemented.
5. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the semantic similarity matching method according to any one of claims 1 to 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011579327.0A CN112560502B (en) | 2020-12-28 | 2020-12-28 | Semantic similarity matching method and device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011579327.0A CN112560502B (en) | 2020-12-28 | 2020-12-28 | Semantic similarity matching method and device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112560502A CN112560502A (en) | 2021-03-26 |
CN112560502B true CN112560502B (en) | 2022-05-13 |
Family
ID=75033914
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011579327.0A Active CN112560502B (en) | 2020-12-28 | 2020-12-28 | Semantic similarity matching method and device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112560502B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113158624B (en) * | 2021-04-09 | 2023-12-08 | 中国人民解放军国防科技大学 | Method and system for fine tuning pre-training language model by fusing language information in event extraction |
CN113656066B (en) * | 2021-08-16 | 2022-08-05 | 南京航空航天大学 | Clone code detection method based on feature alignment |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110348014A (en) * | 2019-07-10 | 2019-10-18 | 电子科技大学 | A kind of semantic similarity calculation method based on deep learning |
CN110765755A (en) * | 2019-10-28 | 2020-02-07 | 桂林电子科技大学 | Semantic similarity feature extraction method based on double selection gates |
CN110826338A (en) * | 2019-10-28 | 2020-02-21 | 桂林电子科技大学 | Fine-grained semantic similarity recognition method for single-choice gate and inter-class measurement |
CN110930219A (en) * | 2019-11-14 | 2020-03-27 | 电子科技大学 | Personalized merchant recommendation method based on multi-feature fusion |
CN111241295A (en) * | 2020-01-03 | 2020-06-05 | 浙江大学 | Knowledge map relation data extraction method based on semantic syntax interactive network |
CN111310438A (en) * | 2020-02-20 | 2020-06-19 | 齐鲁工业大学 | Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10388272B1 (en) * | 2018-12-04 | 2019-08-20 | Sorenson Ip Holdings, Llc | Training speech recognition systems using word sequences |
- 2020-12-28 CN CN202011579327.0A patent/CN112560502B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110348014A (en) * | 2019-07-10 | 2019-10-18 | 电子科技大学 | A kind of semantic similarity calculation method based on deep learning |
CN110765755A (en) * | 2019-10-28 | 2020-02-07 | 桂林电子科技大学 | Semantic similarity feature extraction method based on double selection gates |
CN110826338A (en) * | 2019-10-28 | 2020-02-21 | 桂林电子科技大学 | Fine-grained semantic similarity recognition method for single-choice gate and inter-class measurement |
CN110930219A (en) * | 2019-11-14 | 2020-03-27 | 电子科技大学 | Personalized merchant recommendation method based on multi-feature fusion |
CN111241295A (en) * | 2020-01-03 | 2020-06-05 | 浙江大学 | Knowledge map relation data extraction method based on semantic syntax interactive network |
CN111310438A (en) * | 2020-02-20 | 2020-06-19 | 齐鲁工业大学 | Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model |
Non-Patent Citations (3)
Title |
---|
Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information; Seonhoon Kim et al.; 《https://arxiv.org/pdf/1805.11360.pdf》; 20181102; 1-11 *
Simple and Effective Text Matching with Richer Alignment Features; Runqi Yang et al.; 《Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics》; 20190802; 4699-4709 *
Text matching model based on densely connected network and multi-dimensional feature fusion (基于密集连接网络和多维特征融合的文本匹配模型); Chen Yuelin et al.; 《Journal of Zhejiang University (Engineering Science)》; 20211231; 2352-2358 *
Also Published As
Publication number | Publication date |
---|---|
CN112560502A (en) | 2021-03-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109947912B (en) | Model method based on intra-paragraph reasoning and joint question answer matching | |
CN108334487B (en) | Missing semantic information completion method and device, computer equipment and storage medium | |
Rastogi et al. | Scalable multi-domain dialogue state tracking | |
CN111368565B (en) | Text translation method, text translation device, storage medium and computer equipment | |
CN108052512B (en) | Image description generation method based on depth attention mechanism | |
CN109783817B (en) | Text semantic similarity calculation model based on deep reinforcement learning | |
CN109661664B (en) | Information processing method and related device | |
CN111831789B (en) | Question-answering text matching method based on multi-layer semantic feature extraction structure | |
CN110781306B (en) | English text aspect layer emotion classification method and system | |
Zhang et al. | Exploring question understanding and adaptation in neural-network-based question answering | |
CN112131366A (en) | Method, device and storage medium for training text classification model and text classification | |
CN111581973A (en) | Entity disambiguation method and system | |
CN110457718B (en) | Text generation method and device, computer equipment and storage medium | |
CN109740158B (en) | Text semantic parsing method and device | |
CN112560502B (en) | Semantic similarity matching method and device and storage medium | |
CN110232122A (en) | A kind of Chinese Question Classification method based on text error correction and neural network | |
CN111191002A (en) | Neural code searching method and device based on hierarchical embedding | |
CN109582970B (en) | Semantic measurement method, semantic measurement device, semantic measurement equipment and readable storage medium | |
CN112183085A (en) | Machine reading understanding method and device, electronic equipment and computer storage medium | |
WO2019220113A1 (en) | Device and method for natural language processing | |
WO2022134793A1 (en) | Method and apparatus for extracting semantic information in video frame, and computer device | |
CN116204674B (en) | Image description method based on visual concept word association structural modeling | |
CN113743119A (en) | Chinese named entity recognition module, method and device and electronic equipment | |
JP2017010249A (en) | Parameter learning device, sentence similarity calculation device, method, and program | |
CN114417823A (en) | Aspect level emotion analysis method and device based on syntax and graph convolution network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |