CN112560502B - Semantic similarity matching method and device and storage medium - Google Patents
- Publication number
- CN112560502B (application CN202011579327.0A)
- Authority
- CN
- China
- Prior art keywords
- vector matrix
- fusion
- vector
- splicing
- vectors
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention provides a semantic similarity matching method, device and storage medium, wherein the method comprises the following steps: importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the two samples to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed; and constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix. The method solves the problems of feature loss, insufficient inter-sentence interaction and vanishing network gradients; it enriches the semantic features of sentences, makes the information interaction between sentences more accurate and rich, and can capture semantic information of more sentence pairs.
Description
Technical Field
The invention mainly relates to the technical field of language processing, and in particular to a semantic similarity matching method, a semantic similarity matching device and a storage medium.
Background
Text matching is an important research area in natural language processing. In a text matching task, a model takes two text sequences as input and predicts the semantic relationship between them. Text matching underlies a wide range of tasks: natural language inference (judging whether a hypothesis can be deduced from a premise), paraphrase identification (determining whether two sentences express the same meaning), answer selection and the like can all be regarded as specific forms of the text similarity matching problem. The core problem of text matching is modeling the correlation between two sentences.
The most popular approach to text matching today is the deep neural network, and neural semantic similarity matching models have received wide attention for their strong ability to learn sentence representations. Sentence matching currently has two main frameworks: sentence-encoding-based frameworks and attention-based frameworks. In the first class, a simple matching model is built from two sentence semantic vectors, but matching sentence vectors directly omits the semantic feature interaction between the two sentences. In the second class, an attention mechanism models word-level interaction between the two sentences, fusing information features between them and achieving high accuracy. Deep network models also outperform shallow ones, which shows that deeper networks can learn more semantic features; however, gradients tend to vanish as networks deepen, so Yang et al. proposed the RE2 model, which uses residual network connections to deepen the network, effectively alleviating vanishing gradients and improving model performance. Although RE2 deepens the network with residual connections, its residual connections use summation, so the output features of each residual block are not tightly connected with the original features, which easily causes feature loss. Moreover, for inter-sentence information interaction, introducing an attention mechanism for word-level interaction means the model learns only the similarity semantic features between sentence pairs and does not capture further semantic information of the pair, such as difference semantic features and key semantic features.
Disclosure of Invention
The invention aims to solve the technical problem of the prior art and provides a semantic similarity matching method, a semantic similarity matching device and a semantic similarity matching storage medium.
The technical scheme for solving the technical problems is as follows: a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
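A minimal, hedged sketch of the four steps above in Python (the helper names `word_vectorize`, `fuse`, `pool` and `predict` are illustrative stand-ins, not the patent's components):

```python
import numpy as np

def word_vectorize(tokens, table, dim=4, rng=np.random.default_rng(0)):
    """Look up (or lazily initialize) one embedding per token -> word vector matrix."""
    return np.stack([table.setdefault(t, rng.standard_normal(dim)) for t in tokens])

def match(tokens_a, tokens_b, table, fuse, pool, predict):
    qa = word_vectorize(tokens_a, table)   # first word vector matrix
    qb = word_vectorize(tokens_b, table)   # second word vector matrix
    fa, fb = fuse(qa, qb)                  # first/second fusion vector matrices
    va, vb = pool(fa), pool(fb)            # first/second conversion vectors
    return predict(va, vb)                 # predicted value, used as matching result
```

Any concrete fusion network, pooling layer and prediction head can be plugged in for `fuse`, `pool` and `predict`; the structure only mirrors the claimed data flow.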
The invention has the beneficial effects that: the first word vector matrix and the second word vector matrix are obtained through word vectorization processing of the first sample to be analyzed and the second sample to be analyzed respectively, and the first fusion vector matrix and the second fusion vector matrix are obtained through feature fusion processing of the two word vector matrices by the training network. This solves the problems of feature loss, insufficient inter-sentence interaction and vanishing network gradients, enriches the semantic features of sentences, makes the information interaction between sentences more accurate and rich, and captures semantic information of more sentence pairs.
On the basis of the technical scheme, the invention can be further improved as follows.
Further, the training network comprises a plurality of densely connected networks, and the densely connected networks are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
and respectively inputting the first word vector matrix and the second word vector matrix into a first dense connection network for feature fusion processing, taking a result output by the first dense connection network which is sequentially connected as the input of a next dense connection network, and performing feature fusion processing until the last dense connection network which is sequentially connected outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix.
The beneficial effect of adopting the further scheme is that: the first and second fusion vector matrixes are obtained by respectively performing feature fusion processing on the first and second word vector matrixes through a training network, so that semantic features of sentences are enriched, and the problems of feature loss, insufficient interaction among sentences and network gradient disappearance are solved.
Further, the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a convergence network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
and inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix.
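The internal flow of one dense connection block described above (encode, splice, align, fuse) can be sketched as follows; the `encode`, `align` and `fuse` callables are placeholders for the bidirectional long-short term memory network, the alignment network and the fusion network, and the concatenation along the feature axis mirrors the splicing steps:

```python
import numpy as np

def dense_block(x1, x2, encode, align, fuse):
    # encode stands in for the bidirectional LSTM encoder (assumption: any
    # sequence encoder with matching output width works for this sketch).
    c1, c2 = encode(x1), encode(x2)          # context semantic feature matrices
    s1 = np.concatenate([x1, c1], axis=-1)   # first splicing vector matrix
    s2 = np.concatenate([x2, c2], axis=-1)   # second splicing vector matrix
    m1, m2 = align(s1, s2)                   # first/second semantic matching matrices
    return fuse(s1, m1), fuse(s2, m2)        # first/second fusion vector matrices
```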
The beneficial effect of adopting the further scheme is that: through the feature fusion processing of respectively inputting the first word vector matrix and the second word vector matrix into the first dense connection network, information interaction between sentences is more accurate and rich, the problems of feature loss, insufficient sentence interaction and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
Further, the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors;
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
e_ij = F(q_i)^T F(p_j),

wherein F(·) is a single-layer feed-forward network, q_i is the i-th first splicing vector, p_j is the j-th second splicing vector, T denotes the transpose, and e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector;
calculating a plurality of first semantic matching vectors from the similarity vectors and the plurality of second splicing vectors through a second formula, wherein the second formula is as follows:

q'_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) p_j,

wherein e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector, p_j is the j-th second splicing vector, q'_i is the i-th first semantic matching vector, and l_p is the length of the second splicing vector matrix;
calculating a plurality of second semantic matching vectors from the similarity vectors and the plurality of first splicing vectors through a third formula, wherein the third formula is as follows:

p'_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) q_i,

wherein e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector, q_i is the i-th first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the j-th second semantic matching vector;
and obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors.
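A small sketch of this alignment step, assuming the usual softmax-normalized attention form consistent with the variables listed above (the single-layer feed-forward network F defaults to the identity here):

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def align(q, p, F=lambda x: x):
    """Attention-style alignment between splicing matrices q (l_q, d) and p (l_p, d)."""
    e = F(q) @ F(p).T                   # e[i, j] = F(q_i)^T F(p_j)
    q_prime = softmax(e, axis=1) @ p    # each q_i summarized over the l_p vectors of p
    p_prime = softmax(e, axis=0).T @ q  # each p_j summarized over the l_q vectors of q
    return q_prime, p_prime
```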
The beneficial effect of adopting the further scheme is that: the first semantic matching vector matrix and the second semantic matching vector matrix are obtained by respectively aligning the first splicing vector matrix and the second splicing vector matrix through an alignment network, so that information interaction between sentences is more accurate and rich, the problems of feature loss, insufficient sentence interaction and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
Further, the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
wherein, gi=sigmoid(q′i),
Wherein, q'iFor the first semantic matching vector, qiFor the ith first stitching vector, G, G1,G2、G3、G4And G5Are all single-layer feedforward networks, sigmoid is an activation function, gi∈(0,1)dThe, an is a dot product operation,which means that the elements are multiplied by each other,are all the ith first initial fusion vector, αiIs the ith first fusion vector, [;]splicing operation for vectors;
and obtaining a first fusion vector matrix according to the plurality of first fusion vectors.
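The full fourth formula is not reproduced in this text, so the sketch below only illustrates a plausible gated fusion of the same flavor: three comparison features (raw pair, difference, element-wise product) produced by single-layer projections standing in for G1, G2 and G3, combined through a sigmoid gate. The roles of G4 and G5 here are assumptions, not the patent's exact wiring:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6
# W["G1".."G3"] are single-layer projections standing in for the feed-forward
# networks G1-G3; Wg produces the sigmoid gate (an assumed simplification).
W = {k: rng.standard_normal((2 * d, d)) * 0.1 for k in ("G1", "G2", "G3")}
Wg = rng.standard_normal((d, d)) * 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuse(s, m):
    f1 = np.concatenate([s, m], axis=-1) @ W["G1"]       # raw pair features
    f2 = np.concatenate([s, s - m], axis=-1) @ W["G2"]   # difference features
    f3 = np.concatenate([s, s * m], axis=-1) @ W["G3"]   # similarity features
    g = sigmoid(m @ Wg)                                  # gate g_i in (0, 1)^d
    return g * (f1 + f2 + f3) + (1.0 - g) * s            # gated fusion vectors
```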
The beneficial effect of adopting the further scheme is that: the fourth formula is used for respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors to obtain the plurality of first fusion vectors, so that information interaction between sentences is more accurate and rich, the problems of insufficient interaction between characteristic loss and sentences and network gradient disappearance are solved, and more semantic information of sentence pairs can be captured.
Further, the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
wherein, g'i=sigmoid(p′i),
Wherein, p'iFor the second semantic matching vector, piIs the ith second stitching vector, G'1,G′2、G′3、G′4And G'5Are single-layer feedforward networks, and sigmoid is an activation function, g'i∈(0,1)dThe, an is a dot product operation,which means that the elements are multiplied by each other,are all the ith second initial fusionVector, betaiIs the ith second fusion vector, [;]splicing operation for vectors;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
The beneficial effect of adopting the further scheme is that: the fifth formula is used for respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors to obtain the plurality of second fusion vectors, so that information interaction between sentences is more accurate and rich, the problems of insufficient interaction between the characteristic loss and the sentences and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
Further, the process of performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a prediction value includes:
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix through a sixth formula to obtain a predicted value, wherein the sixth formula is as follows:

ŷ = H([v1; v2; v1 + v2; v1 − v2; |v1 − v2|; v1 ∘ v2]),

wherein H is a multi-layer feed-forward neural network, v1 is the first conversion vector matrix, v2 is the second conversion vector matrix, ŷ is the predicted value, the +, −, |·| and ∘ operations are all performed element-wise, and [;] is the vector splicing operation.
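A hedged sketch of this prediction layer, building the spliced comparison features from the element-wise operations listed (sum, difference, absolute difference, product) and delegating the final score to a caller-supplied network H:

```python
import numpy as np

def predict(v1, v2, H):
    """Splice v1, v2 and their element-wise comparisons, then score with H.

    H stands in for the multi-layer feed-forward network; any callable that
    maps the spliced feature vector to a score works in this sketch."""
    feats = np.concatenate([v1, v2, v1 + v2, v1 - v2, np.abs(v1 - v2), v1 * v2])
    return H(feats)
```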
The beneficial effect of adopting the further scheme is that: the predicted value is obtained through the sixth formula for the relevance prediction processing of the first conversion vector matrix and the second conversion vector matrix, the problems of feature loss, insufficient sentence interaction and disappearance of network gradient are solved, the semantic features of sentences are enriched, the information interaction between sentences is more accurate and abundant, and the semantic information of more sentence pairs can be captured.
Another technical solution of the present invention for solving the above technical problems is as follows: a semantic similarity matching apparatus comprising:
the word vectorization module is used for importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
the feature fusion processing module is used for constructing a training network, and performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network respectively to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the vector conversion module is used for respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and the matching result obtaining module is used for carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
Another technical solution of the present invention for solving the above technical problems is as follows: a semantic similarity matching apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor when executing the computer program implementing the semantic similarity matching method as described above.
Another technical solution of the present invention for solving the above technical problems is as follows: a computer-readable storage medium, storing a computer program which, when executed by a processor, implements the semantic similarity matching method as described above.
Drawings
Fig. 1 is a schematic flow chart of a semantic similarity matching method according to an embodiment of the present invention;
fig. 2 is a block diagram of a semantic similarity matching apparatus according to an embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a schematic flow chart of a semantic similarity matching method according to an embodiment of the present invention.
Example 1:
as shown in fig. 1, a semantic similarity matching method includes the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
Specifically, a word-embedding algorithm is used to perform word vectorization processing on the first sample to be analyzed and the second sample to be analyzed respectively, so as to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed.
It should be understood that the first fused vector matrix and the second fused vector matrix are respectively subjected to vector conversion through the pooling layer, and a first conversion vector matrix corresponding to the first fused vector matrix and a second conversion vector matrix corresponding to the second fused vector matrix are obtained.
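A one-line illustration of such a pooling-based vector conversion; max-over-time pooling is an assumption here, since the text only states that a pooling layer performs the conversion:

```python
import numpy as np

def pool(fused):
    # Max-over-time pooling: collapse a (sentence_length, d) fusion vector
    # matrix into one fixed-size conversion vector, regardless of length.
    return fused.max(axis=0)
```

Because the output size no longer depends on sentence length, the two pooled vectors can be compared directly in the prediction step.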
In this embodiment, the first word vector matrix and the second word vector matrix are obtained by word-vectorizing the first sample to be analyzed and the second sample to be analyzed respectively, and the first fusion vector matrix and the second fusion vector matrix are obtained by feature-fusing the two word vector matrices through the training network. This solves the problems of feature loss, insufficient inter-sentence interaction and vanishing network gradients, enriches the semantic features of the sentences, makes the information interaction between sentences more accurate and rich, and captures semantic information of more sentence pairs.
Example 2:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
and respectively inputting the first word vector matrix and the second word vector matrix into a first dense connection network for feature fusion processing, taking a result output by the first dense connection network which is sequentially connected as the input of a next dense connection network, and performing feature fusion processing until the last dense connection network which is sequentially connected outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix.
It should be understood that connecting with residual connections, i.e. the dense connection network, ties the features from the lowest layer to the highest layer tightly together, so that the semantic features of the sentences are enriched.
It should be understood that the plurality of densely connected networks are connected in sequence, i.e. the output of the first densely connected network is connected to the input of the second, the output of the second is connected to the input of the third, and so on, until the output of the penultimate densely connected network is connected to the input of the last.
Specifically, the first dense connection network performs feature fusion processing on the first word vector matrix and the second word vector matrix respectively. The resulting first fusion vector matrix is then taken as the next first word vector matrix and the second fusion vector matrix as the next second word vector matrix, and the pair is input into the next dense connection network for feature fusion processing, and so on through all the dense connection networks, finally yielding the first fusion vector matrix corresponding to the first word vector matrix and the second fusion vector matrix corresponding to the second word vector matrix.
In particular, the present invention uses an enhanced residual network connection. Denote the input and output of the n-th residual network by x^(n) and o^(n) respectively, let o^(0) be a sequence of zero vectors, and let x^(1), the input of the first residual network, be the word vector matrix. For n ≥ 2, the input of the n-th residual network x^(n) is the splice of the input of the first residual network with the outputs of the previous two residual networks, as follows:

x^(n) = [x^(1); o^(n−1); o^(n−2)],

wherein [;] is the vector splicing operation.
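The splicing-based dense connection described above can be sketched as follows (assuming, for simplicity, that block outputs have the same width as the first input):

```python
import numpy as np

def block_input(x1, outputs):
    """Input of the next block: splice the first block's input x^(1) with the
    outputs of the two preceding blocks; o^(0) is taken as a zero sequence."""
    zero = np.zeros_like(x1)
    o_prev = outputs[-1] if len(outputs) >= 1 else zero
    o_prev2 = outputs[-2] if len(outputs) >= 2 else zero
    return np.concatenate([x1, o_prev, o_prev2], axis=-1)
```

Unlike summation-style residual connections, this splice keeps each block's output features intact alongside the original features, which is the property the text credits with avoiding feature loss.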
In the above embodiment, the first fused vector matrix and the second fused vector matrix are obtained by respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network, so that semantic features of sentences are enriched, and the problems of feature loss, insufficient interaction between sentences and disappearance of network gradients are solved.
Example 3:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into a first dense connection network respectively for feature fusion processing, taking a result output by the first dense connection network which is connected in sequence as the input of a next dense connection network, and performing feature fusion processing until a last dense connection network which is connected in sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
and inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix.
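The first two operations inside a dense connection block — encoding each word vector matrix and splicing it with the resulting contextual features — can be sketched as follows. The BiLSTM is replaced here by a random placeholder encoder (`mock_context_encoder`), an assumption for illustration; any encoder returning one contextual vector per token would slot in:

```python
import numpy as np

def mock_context_encoder(x, hidden=4, seed=0):
    """Placeholder for the bidirectional LSTM: returns one contextual
    feature vector (2*hidden dims, forward + backward) per token."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((x.shape[1], 2 * hidden)) * 0.1
    return np.tanh(x @ w)              # (seq_len, 2*hidden)

def splice(word_vecs, context_vecs):
    """Concatenate each token's word vector with its contextual feature
    vector, producing the splicing vector matrix fed to the alignment net."""
    return np.concatenate([word_vecs, context_vecs], axis=1)

words = np.ones((6, 50))               # 6 tokens, 50-dim word vectors
ctx = mock_context_encoder(words)      # (6, 8)
spliced = splice(words, ctx)
print(spliced.shape)                   # (6, 58)
```

The same two functions would be applied to both sentences, yielding the first and second splicing vector matrices.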
In particular, the alignment network is based on one of the first models to represent sentence pairs with an attention mechanism; compared with other models of its time it achieved state-of-the-art results on the SNLI dataset with roughly an order of magnitude fewer parameters, and it does not rely on word order information.
In this embodiment, the first word vector matrix and the second word vector matrix are respectively input into the first dense connection network for feature fusion processing, so that information interaction between the sentences is more accurate and richer, the problems of feature loss, insufficient interaction between sentences and vanishing network gradients are solved, and more semantic information of the sentence pair can be captured.
Example 4:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into a first dense connection network respectively for feature fusion processing, taking a result output by the first dense connection network which is sequentially connected as the input of a next dense connection network, and performing feature fusion processing until a last dense connection network which is sequentially connected outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
eij=F(qi)TF(pj),
wherein F () is a single layer feedforward network, qiFor the ith first stitching vector, pjIs the jth second stitching vector, T is the transpose, eijA similarity vector of the ith first splicing vector and the jth second splicing vector is obtained;
respectively calculating a plurality of similarity vectors and a plurality of first semantic matching vectors of the first splicing vector by a second formula to obtain a plurality of first semantic matching vectors:
wherein e isijFor the ith first spellingSimilarity vector, p, of the patch vector and the jth second patch vectorjIs the jth second stitching vector, q'iFor the first semantic matching vector, lpIs the second stitching vector matrix length;
respectively calculating a plurality of similarity vectors and a plurality of second semantic matching vectors of the first splicing vector by a third formula to obtain a plurality of second semantic matching vectors, wherein the third formula is as follows:
wherein e isijIs the similarity vector of the ith first stitching vector and the jth second stitching vector, qiFor the ith first stitching vector, lqIs the first concatenation vector matrix length, p'iMatching the vector for the second semantic;
and obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors.
It should be appreciated that the attention mechanism is employed to exchange information between the two sentences, i.e. to soft-align them.
Specifically, let the lengths of the first splicing vector matrix q and the second splicing vector matrix p be l_q and l_p respectively, so that the two splicing vector matrices can be written as q = (q_1, q_2, …, q_{l_q}) and p = (p_1, p_2, …, p_{l_p}). The similarity vector e_ij between q_i and p_j is computed as the dot product of the projected vectors:

e_ij = F(q_i)^T F(p_j),

wherein F is a single-layer feed-forward network. The semantic matching vector of each splicing vector is then computed through soft alignment of the sentence elements, with the formulas:

q′_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) p_j,

p′_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) q_i.
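The alignment step — a similarity matrix followed by softmax-weighted soft alignment in both directions — can be sketched as below. The projection F defaults to the identity here, whereas the method uses a single-layer feed-forward network:

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def soft_align(q, p, F=lambda x: x):
    """e_ij = F(q_i)^T F(p_j); q'_i attends over p, p'_j attends over q."""
    e = F(q) @ F(p).T                     # (l_q, l_p) similarity matrix
    q_aligned = softmax(e, axis=1) @ p    # q'_i: weighted sum of p rows
    p_aligned = softmax(e, axis=0).T @ q  # p'_j: weighted sum of q rows
    return q_aligned, p_aligned

q = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # 2 tokens
p = np.tile(np.array([[1.0, 2.0, 3.0]]), (4, 1))   # 4 identical tokens
qa, pa = soft_align(q, p)
print(qa.shape, pa.shape)   # (2, 3) (4, 3)
```

Each aligned row is a convex combination of the other sentence's rows, so when every row of p is identical, q's aligned vectors all equal that row.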
In the above embodiment, the alignment network aligns the first and second splicing vector matrices to obtain the first and second semantic matching vector matrices, so that information interaction between the sentences is more accurate and richer, the problems of feature loss, insufficient interaction between sentences and vanishing network gradients are solved, and more semantic information of the sentence pair can be captured.
Example 5:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into a first dense connection network respectively for feature fusion processing, taking a result output by the first dense connection network which is connected in sequence as the input of a next dense connection network, and performing feature fusion processing until a last dense connection network which is connected in sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
eij=F(qi)TF(pj),
wherein F () is a single layer feedforward network, qiIs the ith first splicing vector, pj is the jth second splicing vector, T is the transpose, eijA similarity vector of the ith first splicing vector and the jth second splicing vector is obtained;
respectively calculating a plurality of similarity vectors and a plurality of first semantic matching vectors of the first splicing vector by a second formula to obtain a plurality of first semantic matching vectors:
wherein e isijIs the similarity vector, p, of the ith first stitching vector and the jth second stitching vectorjIs the jth second stitching vector, q'iFor the first semantic matching vector, lpIs the second concatenation vector matrix length;
respectively calculating a plurality of similarity vectors and a plurality of second semantic matching vectors of the first splicing vectors by a third formula to obtain a plurality of second semantic matching vectors, wherein the third formula is as follows:
wherein e isijIs the similarity vector of the ith first splicing vector and the jth second splicing vector, qiFor the ith first stitching vector, lqIs the first concatenation vector matrix length, p'iMatching the vector for the second semantic;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
q̄_i^1 = G1([q_i; q′_i]),
q̄_i^2 = G2([q_i; q_i − q′_i]),
q̄_i^3 = G3([q_i; q_i ∘ q′_i]),
q̄_i^4 = G4([q_i; cos(q_i, q′_i)]),
g_i = sigmoid(G([q_i; q′_i])),
α_i = g_i ∘ G5([q̄_i^1; q̄_i^2; q̄_i^3; q̄_i^4]) + (1 − g_i) ∘ q_i,

wherein q′_i is the first semantic matching vector, q_i is the ith first splicing vector, G, G1, G2, G3, G4 and G5 are all single-layer feed-forward networks, sigmoid is an activation function, g_i ∈ (0,1)^d is a gate vector, ∘ denotes element-wise multiplication, q̄_i^1 to q̄_i^4 are the ith first initial fusion vectors, α_i is the ith first fusion vector, and [;] is the vector splicing operation;
and obtaining a first fusion vector matrix according to the plurality of first fusion vectors.
Specifically, the fusion layer compares the local and aligned representations from multiple perspectives. The fusion calculation formulas are:

q̄_i^1 = G1([q_i; q′_i]),
q̄_i^2 = G2([q_i; q_i − q′_i]),
q̄_i^3 = G3([q_i; q_i ∘ q′_i]),
q̄_i^4 = G4([q_i; cos(q_i, q′_i)]),
g_i = sigmoid(G([q_i; q′_i])),
α_i = g_i ∘ G5([q̄_i^1; q̄_i^2; q̄_i^3; q̄_i^4]) + (1 − g_i) ∘ q_i,

wherein G1, G2, G3, G4, G5 and G are single-layer feed-forward networks with independent parameters and ∘ denotes element-wise multiplication. The subtraction highlights the differences between the sentence pair, while the multiplication and cosine terms compare their similarity. sigmoid is the activation function, and g_i ∈ (0,1)^d is a computed selection gate that highlights the key features of the sentence.
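Because the fusion equations are only partially legible in this text, the following NumPy sketch is one plausible reading of the gated multi-perspective fusion (concatenation, difference, product and cosine views combined through a sigmoid gate). The layer sizes, ReLU activations and random weights are assumptions for illustration, not the patent's trained parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuse(q, q_aligned, seed=0):
    """Gated multi-perspective fusion of the local representation q and
    the aligned representation q' (rows are token vectors of width d)."""
    d = q.shape[1]
    rng = np.random.default_rng(seed)

    def ffn(d_in):
        # single-layer feed-forward network with ReLU (random weights)
        w = rng.standard_normal((d_in, d)) * 0.1
        return lambda x: np.maximum(x @ w, 0.0)

    G1, G2, G3 = ffn(2 * d), ffn(2 * d), ffn(2 * d)
    G4, G5 = ffn(d + 1), ffn(4 * d)
    w_gate = rng.standard_normal((2 * d, d)) * 0.1      # gate network G

    cos = (q * q_aligned).sum(1, keepdims=True) / (
        np.linalg.norm(q, axis=1, keepdims=True)
        * np.linalg.norm(q_aligned, axis=1, keepdims=True) + 1e-9)

    m1 = G1(np.concatenate([q, q_aligned], axis=1))      # concatenation view
    m2 = G2(np.concatenate([q, q - q_aligned], axis=1))  # difference view
    m3 = G3(np.concatenate([q, q * q_aligned], axis=1))  # product view
    m4 = G4(np.concatenate([q, cos], axis=1))            # cosine view
    g = sigmoid(np.concatenate([q, q_aligned], axis=1) @ w_gate)  # gate
    return g * G5(np.concatenate([m1, m2, m3, m4], axis=1)) + (1 - g) * q

alpha = fuse(np.ones((5, 8)), np.ones((5, 8)) * 0.5)
print(alpha.shape)   # (5, 8)
```

The gate interpolates between the combined multi-perspective features and the original splicing vector, so key token features can pass through unchanged.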
In this embodiment, the plurality of first fusion vectors are obtained by respectively performing fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors according to the fourth formula, so that information interaction between the sentences is more accurate and richer, the problems of feature loss, insufficient interaction between sentences and vanishing network gradients are solved, and more semantic information of the sentence pair can be captured.
Example 6:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into a first dense connection network respectively for feature fusion processing, taking a result output by the first dense connection network which is sequentially connected as the input of a next dense connection network, and performing feature fusion processing until a last dense connection network which is sequentially connected outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
eij=F(qi)TF(pj),
wherein F () is a single layer feedforward network, qiFor the ith first stitching vector, pjIs the jth second stitching vector, T is the transpose, eijA similarity vector of the ith first splicing vector and the jth second splicing vector is obtained;
respectively calculating a plurality of similarity vectors and a plurality of first semantic matching vectors of the first splicing vector by a second formula to obtain a plurality of first semantic matching vectors:
wherein e isijIs the similarity vector, p, of the ith first stitching vector and the jth second stitching vectorjIs the jth second stitching vector, q'iFor the first semantic matching vector, lpIs the second concatenation vector matrix length;
respectively calculating a plurality of similarity vectors and a plurality of second semantic matching vectors of the first splicing vectors by a third formula to obtain a plurality of second semantic matching vectors, wherein the third formula is as follows:
wherein e isijIs the similarity vector of the ith first splicing vector and the jth second splicing vector, qiFor the ith first stitching vector, lqIs the first concatenation vector matrix length, p'iMatching the vector for the second semantic;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
p̄_i^1 = G′1([p_i; p′_i]),
p̄_i^2 = G′2([p_i; p_i − p′_i]),
p̄_i^3 = G′3([p_i; p_i ∘ p′_i]),
p̄_i^4 = G′4([p_i; cos(p_i, p′_i)]),
g′_i = sigmoid(G′([p_i; p′_i])),
β_i = g′_i ∘ G′5([p̄_i^1; p̄_i^2; p̄_i^3; p̄_i^4]) + (1 − g′_i) ∘ p_i,

wherein p′_i is the second semantic matching vector, p_i is the ith second splicing vector, G′, G′1, G′2, G′3, G′4 and G′5 are all single-layer feed-forward networks, sigmoid is an activation function, g′_i ∈ (0,1)^d is a gate vector, ∘ denotes element-wise multiplication, p̄_i^1 to p̄_i^4 are the ith second initial fusion vectors, β_i is the ith second fusion vector, and [;] is the vector splicing operation;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
In this embodiment, the plurality of second fusion vectors are obtained by respectively performing fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors according to the fifth formula, so that information interaction between the sentences is more accurate and richer, the problems of feature loss, insufficient interaction between sentences and vanishing network gradients are solved, and more semantic information of the sentence pair can be captured.
Example 7:
a semantic similarity matching method is characterized by comprising the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the process of performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value comprises the following steps:
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix through a sixth expression to obtain a predicted value, wherein the sixth expression is as follows:
ŷ = H([v1; v2; v1 + v2; v1 − v2; |v1 − v2|; v1 ∘ v2]),

wherein H is a multi-layer feed-forward neural network, v1 is the first conversion vector matrix, v2 is the second conversion vector matrix, ŷ is the predicted value, the +, −, |·| and ∘ operations are all performed element by element, and [;] is the vector splicing operation.
In particular, the prediction layer takes as input the first conversion vector matrix v1 and the second conversion vector matrix v2, the two sequences output by the pooling layer, and obtains the predicted value according to the formula:

ŷ = H([v1; v2; v1 + v2; v1 − v2; |v1 − v2|; v1 ∘ v2]),

wherein H is a multi-layer feed-forward neural network. In a classification task, ŷ ∈ R^C denotes the unnormalized scores and C is the number of classes. In a regression task, ŷ is the predicted scalar value, i.e. the predicted value.
In the paraphrase recognition task, the prediction layer is:

ŷ = H([v1; v2; v1 ∘ v2; |v1 − v2|]).

A simplified version of the prediction layer network is also provided, with the formula:

ŷ = H([v1; v2]).

In the prediction layer, the +, −, |·| and ∘ operations are performed element by element to infer the relationship between the two sentences.
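The prediction layer's feature construction can be sketched as below. H is left as the identity, standing in for the multi-layer feed-forward network, and including all of the +, −, |·| and ∘ operations named above is an assumption about the exact feature set:

```python
import numpy as np

def predict(v1, v2, H=lambda z: z):
    """Build [v1; v2; v1+v2; v1-v2; |v1-v2|; v1*v2] (element-wise ops)
    and pass it through H, the classifier/regressor network."""
    feats = np.concatenate([v1, v2, v1 + v2, v1 - v2,
                            np.abs(v1 - v2), v1 * v2])
    return H(feats)

v1 = np.array([1.0, 2.0])    # pooled vector of sentence one
v2 = np.array([3.0, 4.0])    # pooled vector of sentence two
out = predict(v1, v2)        # [v1, v2, v1+v2, v1-v2, |v1-v2|, v1*v2] flattened
print(out.shape)             # (12,): six d-dimensional views concatenated
```

In practice H would map these 6d features to C unnormalized class scores (classification) or to a single scalar (regression).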
In this embodiment, the predicted value is obtained by performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix according to the sixth formula, so that the problems of feature loss, insufficient sentence interaction and vanishing network gradients are solved, the semantic features of the sentences are enriched, information interaction between the sentences is more accurate and richer, and more semantic information of the sentence pair can be captured.
Fig. 2 is a block diagram of a semantic similarity matching apparatus according to an embodiment of the present invention.
Example 8:
as shown in fig. 2, a semantic similarity matching apparatus includes:
the word vectorization module is used for importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
the feature fusion processing module is used for constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the vector conversion module is used for respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and the matching result obtaining module is used for carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
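The four modules above form a linear pipeline: word vectorization → feature fusion → vector conversion → relevance prediction. The sketch below shows only that wiring; every module body is a toy stand-in (the real networks are the BiLSTM, dense-connection, alignment and fusion structures described elsewhere in this document), and all helper names and transforms are illustrative assumptions:

```python
def word_vectorization(sample):
    """Toy stand-in: map each token to a small fixed vector (hash-based)."""
    return [[float((hash(tok) >> s) % 7) for s in (0, 3, 6)]
            for tok in sample.split()]

def feature_fusion(matrix):
    """Placeholder for the densely connected training network."""
    return [row + [sum(row)] for row in matrix]  # append a crude context feature

def vector_conversion(matrix):
    """Placeholder pooling: one fixed-size vector per sentence (max over rows)."""
    return [max(col) for col in zip(*matrix)]

def relevance_prediction(v1, v2):
    """Placeholder predictor: negative mean absolute difference (0 = identical)."""
    return -sum(abs(a - b) for a, b in zip(v1, v2)) / len(v1)

def match(sample1, sample2):
    m1, m2 = word_vectorization(sample1), word_vectorization(sample2)
    f1, f2 = feature_fusion(m1), feature_fusion(m2)
    v1, v2 = vector_conversion(f1), vector_conversion(f2)
    return relevance_prediction(v1, v2)  # higher = more similar

print(match("how old are you", "how old are you"))  # 0 for identical inputs
```

Each placeholder function corresponds one-to-one to a module of the apparatus, so swapping in trained networks changes the bodies but not the data flow.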
Example 9:
a semantic similarity matching apparatus comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the semantic similarity matching method described above when executing the computer program. The apparatus may be a computer or the like.
Example 10:
a computer-readable storage medium, storing a computer program which, when executed by a processor, implements the semantic similarity matching method as described above.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the part of the technical solution of the present invention that essentially contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (5)
1. A semantic similarity matching method is characterized by comprising the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing, and taking the result output by each dense connection network as the input of the next dense connection network in the sequence for further feature fusion processing, until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
eij=F(qi)TF(pj),
wherein F(·) is a single-layer feedforward network, q_i is the ith first splicing vector, p_j is the jth second splicing vector, T denotes the transpose, and e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector;
respectively calculating the plurality of similarity vectors and the plurality of second splicing vectors through a second formula to obtain a plurality of first semantic matching vectors, wherein the second formula is:
q'_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) · p_j,
wherein e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, p_j is the jth second splicing vector, q'_i is the ith first semantic matching vector, and l_p is the length of the second splicing vector matrix;
respectively calculating the plurality of similarity vectors and the plurality of first splicing vectors through a third formula to obtain a plurality of second semantic matching vectors, wherein the third formula is:
p'_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) · q_i,
wherein e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, q_i is the ith first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the jth second semantic matching vector;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
wherein g_i = sigmoid(q'_i),
wherein q'_i is the ith first semantic matching vector, q_i is the ith first splicing vector, G, G_1, G_2, G_3, G_4 and G_5 are all single-layer feedforward networks, sigmoid is the activation function, g_i ∈ (0,1)^d is the gating vector, · is the dot-product operation, ∘ denotes element-wise multiplication, the intermediate vectors are the ith first initial fusion vectors, α_i is the ith first fusion vector, and [;] is the vector splicing operation;
obtaining a first fusion vector matrix according to the plurality of first fusion vectors;
the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
wherein g'_i = sigmoid(p'_i),
wherein p'_i is the ith second semantic matching vector, p_i is the ith second splicing vector, G'_1, G'_2, G'_3, G'_4 and G'_5 are all single-layer feedforward networks, sigmoid is the activation function, g'_i ∈ (0,1)^d is the gating vector, · is the dot-product operation, ∘ denotes element-wise multiplication, the intermediate vectors are the ith second initial fusion vectors, β_i is the ith second fusion vector, and [;] is the vector splicing operation;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
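The alignment and fusion steps of claim 1 can be sketched end to end in pure Python. The formula images are not reproduced in this text, so the code follows the soft-alignment form implied by the surviving definitions (a softmax over the similarity scores e_ij in both directions) and uses a generic sigmoid-gated fusion in place of the patent's G_1…G_5 networks; the single-layer network F and the gate wiring are illustrative assumptions:

```python
import math
import random

random.seed(1)

def linear(x, W):
    """Single-layer feedforward network without bias: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def align(q, p, F):
    """e_ij = F(q_i)^T F(p_j); q'_i, p'_j are softmax-weighted sums."""
    e = [[dot(linear(qi, F), linear(pj, F)) for pj in p] for qi in q]
    q_prime = []
    for i in range(len(q)):                       # attend over p for each q_i
        w = softmax(e[i])
        q_prime.append([sum(w[j] * p[j][k] for j in range(len(p)))
                        for k in range(len(p[0]))])
    p_prime = []
    for j in range(len(p)):                       # attend over q for each p_j
        w = softmax([e[i][j] for i in range(len(q))])
        p_prime.append([sum(w[i] * q[i][k] for i in range(len(q)))
                        for k in range(len(q[0]))])
    return q_prime, p_prime

def fuse(x, x_aligned):
    """Generic sigmoid-gated fusion: g ∘ aligned + (1 - g) ∘ original."""
    fused = []
    for xi, ai in zip(x, x_aligned):
        g = [1.0 / (1.0 + math.exp(-(u + v))) for u, v in zip(xi, ai)]
        fused.append([gk * ak + (1 - gk) * uk
                      for gk, ak, uk in zip(g, ai, xi)])
    return fused

d = 3
F = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
q = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(4)]  # l_q = 4
p = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(5)]  # l_p = 5
q_prime, p_prime = align(q, p, F)
alpha = fuse(q, q_prime)   # first fusion vectors
beta = fuse(p, p_prime)    # second fusion vectors
print(len(q_prime), len(p_prime), len(alpha), len(beta))  # 4 5 4 5
```

Each aligned vector keeps the dimensionality of the opposite sentence's splicing vectors, and each fusion vector keeps its own sentence length, matching the shapes the claim language requires.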
2. The semantic similarity matching method according to claim 1, wherein the predicting the correlation between the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value comprises:
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix through a sixth expression to obtain a predicted value, wherein the sixth expression is as follows:
3. A semantic similarity matching apparatus, comprising:
the word vectorization module is used for importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
the feature fusion processing module is used for constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the vector conversion module is used for respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
a matching result obtaining module, configured to perform relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a prediction value, and use the prediction value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
in the feature fusion processing module, the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing, and taking the result output by each dense connection network as the input of the next dense connection network in the sequence for further feature fusion processing, until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
in the feature fusion processing module, the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
in the feature fusion processing module, the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix includes:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
eij=F(qi)TF(pj),
wherein F(·) is a single-layer feedforward network, q_i is the ith first splicing vector, p_j is the jth second splicing vector, T denotes the transpose, and e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector;
respectively calculating the plurality of similarity vectors and the plurality of second splicing vectors through a second formula to obtain a plurality of first semantic matching vectors, wherein the second formula is:
q'_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) · p_j,
wherein e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, p_j is the jth second splicing vector, q'_i is the ith first semantic matching vector, and l_p is the length of the second splicing vector matrix;
respectively calculating the plurality of similarity vectors and the plurality of first splicing vectors through a third formula to obtain a plurality of second semantic matching vectors, wherein the third formula is:
p'_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) · q_i,
wherein e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, q_i is the ith first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the jth second semantic matching vector;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
in the feature fusion processing module, the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
wherein g_i = sigmoid(q'_i),
wherein q'_i is the ith first semantic matching vector, q_i is the ith first splicing vector, G, G_1, G_2, G_3, G_4 and G_5 are all single-layer feedforward networks, sigmoid is the activation function, g_i ∈ (0,1)^d is the gating vector, · is the dot-product operation, ∘ denotes element-wise multiplication, the intermediate vectors are the ith first initial fusion vectors, α_i is the ith first fusion vector, and [;] is the vector splicing operation;
obtaining a first fusion vector matrix according to the plurality of first fusion vectors;
in the feature fusion processing module, the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
wherein g'_i = sigmoid(p'_i),
wherein p'_i is the ith second semantic matching vector, p_i is the ith second splicing vector, G'_1, G'_2, G'_3, G'_4 and G'_5 are all single-layer feedforward networks, sigmoid is the activation function, g'_i ∈ (0,1)^d is the gating vector, · is the dot-product operation, ∘ denotes element-wise multiplication, the intermediate vectors are the ith second initial fusion vectors, β_i is the ith second fusion vector, and [;] is the vector splicing operation;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
4. A semantic similarity matching apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that when the computer program is executed by the processor, the semantic similarity matching method according to any one of claims 1 to 2 is implemented.
5. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the semantic similarity matching method according to any one of claims 1 to 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011579327.0A CN112560502B (en) | 2020-12-28 | 2020-12-28 | Semantic similarity matching method and device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011579327.0A CN112560502B (en) | 2020-12-28 | 2020-12-28 | Semantic similarity matching method and device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112560502A CN112560502A (en) | 2021-03-26 |
CN112560502B true CN112560502B (en) | 2022-05-13 |
Family
ID=75033914
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011579327.0A Active CN112560502B (en) | 2020-12-28 | 2020-12-28 | Semantic similarity matching method and device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112560502B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113158624B (en) * | 2021-04-09 | 2023-12-08 | 中国人民解放军国防科技大学 | Method and system for fine tuning pre-training language model by fusing language information in event extraction |
CN113656066B (en) * | 2021-08-16 | 2022-08-05 | 南京航空航天大学 | Clone code detection method based on feature alignment |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110348014A (en) * | 2019-07-10 | 2019-10-18 | 电子科技大学 | A kind of semantic similarity calculation method based on deep learning |
CN110765755A (en) * | 2019-10-28 | 2020-02-07 | 桂林电子科技大学 | Semantic similarity feature extraction method based on double selection gates |
CN110826338A (en) * | 2019-10-28 | 2020-02-21 | 桂林电子科技大学 | Fine-grained semantic similarity recognition method for single-choice gate and inter-class measurement |
CN110930219A (en) * | 2019-11-14 | 2020-03-27 | 电子科技大学 | Personalized merchant recommendation method based on multi-feature fusion |
CN111241295A (en) * | 2020-01-03 | 2020-06-05 | 浙江大学 | Knowledge map relation data extraction method based on semantic syntax interactive network |
CN111310438A (en) * | 2020-02-20 | 2020-06-19 | 齐鲁工业大学 | Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10388272B1 (en) * | 2018-12-04 | 2019-08-20 | Sorenson Ip Holdings, Llc | Training speech recognition systems using word sequences |
- 2020-12-28 CN CN202011579327.0A patent/CN112560502B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110348014A (en) * | 2019-07-10 | 2019-10-18 | 电子科技大学 | A kind of semantic similarity calculation method based on deep learning |
CN110765755A (en) * | 2019-10-28 | 2020-02-07 | 桂林电子科技大学 | Semantic similarity feature extraction method based on double selection gates |
CN110826338A (en) * | 2019-10-28 | 2020-02-21 | 桂林电子科技大学 | Fine-grained semantic similarity recognition method for single-choice gate and inter-class measurement |
CN110930219A (en) * | 2019-11-14 | 2020-03-27 | 电子科技大学 | Personalized merchant recommendation method based on multi-feature fusion |
CN111241295A (en) * | 2020-01-03 | 2020-06-05 | 浙江大学 | Knowledge map relation data extraction method based on semantic syntax interactive network |
CN111310438A (en) * | 2020-02-20 | 2020-06-19 | 齐鲁工业大学 | Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model |
Non-Patent Citations (3)
Title |
---|
Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information; Seonhoon Kim et al.; 《https://arxiv.org/pdf/1805.11360.pdf》; 20181102; 1-11 *
Simple and Effective Text Matching with Richer Alignment Features; Runqi Yang et al.; 《Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics》; 20190802; 4699-4709 *
Text matching model based on densely connected network and multi-dimensional feature fusion (基于密集连接网络和多维特征融合的文本匹配模型); Chen Yuelin et al.; 《Journal of Zhejiang University (Engineering Science)》; 20211231; 2352-2358 *
Also Published As
Publication number | Publication date |
---|---|
CN112560502A (en) | 2021-03-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109947912B (en) | Model method based on intra-paragraph reasoning and joint question answer matching | |
CN108334487B (en) | Missing semantic information completion method and device, computer equipment and storage medium | |
Rastogi et al. | Scalable multi-domain dialogue state tracking | |
CN111368565B (en) | Text translation method, text translation device, storage medium and computer equipment | |
CN108052512B (en) | Image description generation method based on depth attention mechanism | |
CN109783817B (en) | Text semantic similarity calculation model based on deep reinforcement learning | |
CN109661664B (en) | Information processing method and related device | |
CN111831789B (en) | Question-answering text matching method based on multi-layer semantic feature extraction structure | |
CN110781306B (en) | English text aspect layer emotion classification method and system | |
Zhang et al. | Exploring question understanding and adaptation in neural-network-based question answering | |
CN112131366A (en) | Method, device and storage medium for training text classification model and text classification | |
CN111581973A (en) | Entity disambiguation method and system | |
CN110457718B (en) | Text generation method and device, computer equipment and storage medium | |
CN109740158B (en) | Text semantic parsing method and device | |
CN112560502B (en) | Semantic similarity matching method and device and storage medium | |
CN110232122A (en) | A kind of Chinese Question Classification method based on text error correction and neural network | |
CN111191002A (en) | Neural code searching method and device based on hierarchical embedding | |
CN109582970B (en) | Semantic measurement method, semantic measurement device, semantic measurement equipment and readable storage medium | |
CN112183085A (en) | Machine reading understanding method and device, electronic equipment and computer storage medium | |
WO2019220113A1 (en) | Device and method for natural language processing | |
WO2022134793A1 (en) | Method and apparatus for extracting semantic information in video frame, and computer device | |
CN116204674B (en) | Image description method based on visual concept word association structural modeling | |
CN113743119A (en) | Chinese named entity recognition module, method and device and electronic equipment | |
JP2017010249A (en) | Parameter learning device, sentence similarity calculation device, method, and program | |
CN114417823A (en) | Aspect level emotion analysis method and device based on syntax and graph convolution network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |