CN112560502B - Semantic similarity matching method and device and storage medium - Google Patents

Semantic similarity matching method and device and storage medium

Info

Publication number
CN112560502B
Authority
CN
China
Prior art keywords
vector matrix
fusion
vector
splicing
vectors
Legal status
Active
Application number
CN202011579327.0A
Other languages
Chinese (zh)
Other versions
CN112560502A (en)
Inventor
蔡晓东
田文靖
Current Assignee
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Application filed by Guilin University of Electronic Technology
Priority to CN202011579327.0A
Publication of CN112560502A
Application granted
Publication of CN112560502B

Classifications

    • G06F40/30 Semantic analysis (Handling natural language data)
    • G06F18/214 Generating training patterns; bootstrap methods, e.g. bagging or boosting (Pattern recognition)
    • G06F18/253 Fusion techniques of extracted features (Pattern recognition)
    • G06F40/194 Calculation of difference between files (Text processing)
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates (Natural language analysis)
    • G06N3/044 Recurrent networks, e.g. Hopfield networks (Neural network architectures)
    • G06N3/045 Combinations of networks (Neural network architectures)

Abstract

The invention provides a semantic similarity matching method, device and storage medium. The method comprises the following steps: importing a first sample to be analyzed and a second sample to be analyzed, and performing word vectorization on each to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed; and constructing a training network, and performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix. The method solves the problems of feature loss, insufficient sentence interaction and network gradient disappearance, enriches the semantic features of sentences, makes the information interaction between sentences more accurate and rich, and can capture the semantic information of more sentence pairs.

Description

Semantic similarity matching method and device and storage medium
Technical Field
The invention relates generally to the technical field of language processing, and in particular to a semantic similarity matching method, device and storage medium.
Background
Text matching is an important research area in natural language processing. In a text matching task, a model takes two text sequences as input and predicts the semantic relationship between them. Text matching applies widely: natural language inference (judging whether a hypothesis can be deduced from a premise), paraphrase recognition (determining whether two sentences express the same meaning), answer selection and other tasks can all be regarded as specific forms of the text similarity matching problem. The core problem of text matching is to model the correlation between two sentences.
The most popular approach to text matching today is the deep neural network, and neural-network-based semantic similarity matching models have received wide attention for their strong ability to learn sentence representations. At present, the sentence matching task mainly follows two frameworks: sentence-encoding-based frameworks and attention-based frameworks. In the first class, a simple matching model is built from the semantic vectors of the two sentences, but matching sentence-level semantic vectors directly ignores the semantic feature interaction between the two sentences. In the second class, an attention mechanism is introduced to model word-level interaction between the two sentences, fusing the information features between them and achieving high accuracy. Deep network models also outperform shallow ones, showing that deeper networks can learn more semantic features; however, gradients tend to vanish as the network deepens. Yang et al. therefore proposed the RE2 model, which uses residual network connections to deepen the network, effectively alleviating gradient vanishing and improving model performance. Although RE2 deepens the network with residual connections, its residual connections use summation, so the output features of each residual block are not tightly connected with the original features, which easily causes feature loss. In addition, for information interaction between sentences, RE2 introduces an attention mechanism only for word-level interaction, so the model learns only the similarity semantic features between sentence pairs and does not capture further semantic information of the sentence pair, such as difference semantic features and key semantic features.
Disclosure of Invention
The invention aims to address the above problems of the prior art by providing a semantic similarity matching method, device and storage medium.
The technical scheme for solving the technical problems is as follows: a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
The invention has the beneficial effects that: the first word vector matrix and the second word vector matrix are obtained through word vectorization of the first sample to be analyzed and the second sample to be analyzed respectively, and the first fusion vector matrix and the second fusion vector matrix are obtained through feature fusion of the two word vector matrices by the training network; this solves the problems of feature loss, insufficient interaction between sentences and network gradient disappearance, enriches the semantic features of sentences, makes the information interaction between sentences more accurate and rich, and captures the semantic information of more sentence pairs.
On the basis of the technical scheme, the invention can be further improved as follows.
Further, the training network comprises a plurality of densely connected networks, and the densely connected networks are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix respectively into the first dense connection network for feature fusion processing, taking the output of each dense connection network in the sequence as the input of the next, and repeating the feature fusion processing until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix.
The beneficial effect of adopting the further scheme is that: the first and second fusion vector matrices are obtained by respectively performing feature fusion processing on the first and second word vector matrices through the training network, so that the semantic features of sentences are enriched, and the problems of feature loss, insufficient interaction among sentences and network gradient disappearance are solved.
Further, the dense connection network comprises a bidirectional long short-term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long short-term memory network respectively, and encoding the first word vector matrix and the second word vector matrix through the bidirectional long short-term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
and inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix.
The beneficial effect of adopting the further scheme is that: through the feature fusion processing of respectively inputting the first word vector matrix and the second word vector matrix into the first dense connection network, information interaction between sentences is more accurate and rich, the problems of feature loss, insufficient sentence interaction and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
Further, the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
e_ij = F(q_i)^T F(p_j),

where F(·) is a single-layer feedforward network, q_i is the ith first splicing vector, p_j is the jth second splicing vector, T denotes transposition, and e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector;

calculating a plurality of first semantic matching vectors from the plurality of similarity vectors and the plurality of second splicing vectors through a second formula, the second formula being:

q'_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) p_j,

where e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, p_j is the jth second splicing vector, q'_i is the ith first semantic matching vector, and l_p is the length of the second splicing vector matrix;

calculating a plurality of second semantic matching vectors from the plurality of similarity vectors and the plurality of first splicing vectors through a third formula, the third formula being:

p'_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) q_i,

where e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, q_i is the ith first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the jth second semantic matching vector;
and obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors.
The beneficial effect of adopting the further scheme is that: the first semantic matching vector matrix and the second semantic matching vector matrix are obtained by respectively aligning the first splicing vector matrix and the second splicing vector matrix through an alignment network, so that information interaction between sentences is more accurate and rich, the problems of feature loss, insufficient sentence interaction and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
Further, the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
α_i = G([q_i^(1); q_i^(2); q_i^(3); q_i^(4); q_i^(5)]),

where

q_i^(1) = G_1([q_i; q'_i]),
q_i^(2) = G_2([q_i; q_i - q'_i]),
q_i^(3) = G_3([q_i; q_i ∘ q'_i]),
q_i^(4) = G_4([q_i; cos(q_i, q'_i)]),
q_i^(5) = G_5([q_i; g_i ∘ q'_i]),

where g_i = sigmoid(q'_i),

where q'_i is the ith first semantic matching vector, q_i is the ith first splicing vector, G, G_1, G_2, G_3, G_4 and G_5 are all single-layer feedforward networks, sigmoid is the activation function, g_i ∈ (0,1)^d is a selection gate, ∘ denotes element-wise multiplication, q_i^(1) through q_i^(5) are the ith first initial fusion vectors, α_i is the ith first fusion vector, and [;] is the vector splicing operation;
and obtaining a first fusion vector matrix according to the plurality of first fusion vectors.
The beneficial effect of adopting the further scheme is that: the fourth formula is used for respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors to obtain the plurality of first fusion vectors, so that information interaction between sentences is more accurate and rich, the problems of insufficient interaction between characteristic loss and sentences and network gradient disappearance are solved, and more semantic information of sentence pairs can be captured.
Further, the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
β_i = G'([p_i^(1); p_i^(2); p_i^(3); p_i^(4); p_i^(5)]),

where

p_i^(1) = G'_1([p_i; p'_i]),
p_i^(2) = G'_2([p_i; p_i - p'_i]),
p_i^(3) = G'_3([p_i; p_i ∘ p'_i]),
p_i^(4) = G'_4([p_i; cos(p_i, p'_i)]),
p_i^(5) = G'_5([p_i; g'_i ∘ p'_i]),

where g'_i = sigmoid(p'_i),

where p'_i is the ith second semantic matching vector, p_i is the ith second splicing vector, G', G'_1, G'_2, G'_3, G'_4 and G'_5 are all single-layer feedforward networks, sigmoid is the activation function, g'_i ∈ (0,1)^d is a selection gate, ∘ denotes element-wise multiplication, p_i^(1) through p_i^(5) are the ith second initial fusion vectors, β_i is the ith second fusion vector, and [;] is the vector splicing operation;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
The beneficial effect of adopting the further scheme is that: the fifth formula is used for respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors to obtain the plurality of second fusion vectors, so that information interaction between sentences is more accurate and rich, the problems of insufficient interaction between the characteristic loss and the sentences and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
Further, the process of performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a prediction value includes:
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix through a sixth expression to obtain a predicted value, wherein the sixth expression is as follows:
ŷ = H([v1; v2; v1 + v2; v1 - v2; |v1 - v2|; v1 ∘ v2]),

where H is a multi-layer feedforward neural network, v1 is the first conversion vector matrix, v2 is the second conversion vector matrix, ŷ is the predicted value, the +, -, |·| and ∘ operations are all performed element by element, and [;] is the vector splicing operation.
The beneficial effect of adopting the further scheme is that: the predicted value is obtained through the sixth formula for the relevance prediction processing of the first conversion vector matrix and the second conversion vector matrix, the problems of feature loss, insufficient sentence interaction and disappearance of network gradient are solved, the semantic features of sentences are enriched, the information interaction between sentences is more accurate and abundant, and the semantic information of more sentence pairs can be captured.
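For illustration only, the prediction step could be sketched in PyTorch as follows; the dimension D, the depth of H, and the exact feature combination [v1; v2; v1+v2; v1-v2; |v1-v2|; v1∘v2] are assumptions consistent with the element-wise operations named above, not the patented implementation.

    import torch
    import torch.nn as nn

    D = 200  # assumed dimension of the conversion vectors
    # H: a multi-layer feedforward network (depth and widths are assumptions)
    H = nn.Sequential(nn.Linear(6 * D, D), nn.ReLU(), nn.Linear(D, 2))

    def predict(v1: torch.Tensor, v2: torch.Tensor) -> torch.Tensor:
        """Relevance prediction from the two conversion vectors, shape (D,) each.
        The combination mirrors the +, -, |.| and element-wise product above."""
        feats = torch.cat([v1, v2, v1 + v2, v1 - v2,
                           (v1 - v2).abs(), v1 * v2], dim=-1)
        return H(feats)  # predicted value (e.g., two-class matching logits)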
Another technical solution of the present invention for solving the above technical problems is as follows: a semantic similarity matching apparatus comprising:
the word vectorization module is used for importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
the feature fusion processing module is used for constructing a training network, and performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network respectively to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the vector conversion module is used for respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and the matching result obtaining module is used for carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
Another technical solution of the present invention for solving the above technical problems is as follows: a semantic similarity matching apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor when executing the computer program implementing the semantic similarity matching method as described above.
Another technical solution of the present invention for solving the above technical problems is as follows: a computer-readable storage medium, storing a computer program which, when executed by a processor, implements the semantic similarity matching method as described above.
Drawings
Fig. 1 is a schematic flow chart of a semantic similarity matching method according to an embodiment of the present invention;
fig. 2 is a block diagram of a semantic similarity matching apparatus according to an embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a schematic flow chart of a semantic similarity matching method according to an embodiment of the present invention.
Example 1:
as shown in fig. 1, a semantic similarity matching method includes the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
Specifically, word vectorization is performed on the first sample to be analyzed and the second sample to be analyzed respectively by using a word embedding algorithm, so as to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed.
It should be understood that the first fused vector matrix and the second fused vector matrix are respectively subjected to vector conversion through the pooling layer, and a first conversion vector matrix corresponding to the first fused vector matrix and a second conversion vector matrix corresponding to the second fused vector matrix are obtained.
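As a minimal sketch of these two steps (the vocabulary size, embedding dimension, and the max-over-time pooling choice are assumptions, not taken from the patent):

    import torch
    import torch.nn as nn

    VOCAB_SIZE, EMBED_DIM = 30000, 300  # assumed sizes
    embedding = nn.Embedding(VOCAB_SIZE, EMBED_DIM)

    def word_vectorize(token_ids: torch.Tensor) -> torch.Tensor:
        """Turn a tokenized sample (seq_len,) into its word vector matrix."""
        return embedding(token_ids)            # (seq_len, EMBED_DIM)

    def to_conversion_vector(fusion_matrix: torch.Tensor) -> torch.Tensor:
        """Pooling layer: convert a fusion vector matrix (seq_len, d) into a
        single conversion vector (d,); max pooling is one common choice."""
        return fusion_matrix.max(dim=0).values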
In this embodiment, the first word vector matrix and the second word vector matrix are obtained by word vectorization of the first and second samples to be analyzed respectively, and the first and second fusion vector matrices are obtained by feature fusion of the two word vector matrices through the training network. This solves the problems of feature loss, insufficient interaction between sentences and network gradient disappearance, enriches the semantic features of the sentences, makes the information interaction between sentences more accurate and rich, and allows more semantic information of sentence pairs to be captured.
Example 2:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix respectively into the first dense connection network for feature fusion processing, taking the output of each dense connection network in the sequence as the input of the next, and repeating the feature fusion processing until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix.
It should be understood that by using residual connections, i.e., the densely connected network, the features from the lowest layer to the highest layer are tightly connected, which enriches the semantic features of the sentences.
It should be understood that the plurality of densely connected networks are connected in sequence, i.e., the output of the first densely connected network is connected to the input of the second, the output of the second to the input of the third, and so on, until the output of the penultimate densely connected network is connected to the input of the last one.
Specifically, the first dense connection network performs feature fusion processing on the first word vector matrix and the second word vector matrix respectively to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix; the first fusion vector matrix is then taken as the next first word vector matrix and the second fusion vector matrix as the next second word vector matrix, and the two are respectively input into the next dense connection network for feature fusion processing, and so on through all the dense connection networks, finally obtaining the first fusion vector matrix corresponding to the first word vector matrix and the second fusion vector matrix corresponding to the second word vector matrix.
In particular, the present invention uses an enhanced residual network connection. Denote the input and output of the nth residual block by x^(n) and o^(n) respectively, let o^(0) be a sequence of zero vectors, and let x^(1), the input of the first residual block, be the word vector matrix. The input x^(n) of the nth residual block (n ≥ 2) is the splice of the input x^(1) of the first residual block with the outputs of the previous two residual blocks:

x^(n) = [x^(1); o^(n-1); o^(n-2)],

where [;] is the vector splicing operation.
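A minimal PyTorch sketch of this enhanced residual (dense) connection, under the assumption that each block's input splices the original word vector matrix with the outputs of the two preceding blocks, exactly as in the formula above:

    import torch

    def block_input(x1: torch.Tensor, outputs: list) -> torch.Tensor:
        """Input of the nth residual block: x(n) = [x(1); o(n-1); o(n-2)],
        with o(0) taken as a zero-vector sequence.
        x1: (seq_len, d); outputs: the outputs o(1)..o(n-1) of earlier blocks."""
        if not outputs:                      # n == 1: the word vector matrix
            return x1
        o_prev1 = outputs[-1]                                    # o(n-1)
        o_prev2 = outputs[-2] if len(outputs) >= 2 else torch.zeros_like(o_prev1)
        return torch.cat([x1, o_prev1, o_prev2], dim=-1)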
In the above embodiment, the first fused vector matrix and the second fused vector matrix are obtained by respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network, so that semantic features of sentences are enriched, and the problems of feature loss, insufficient interaction between sentences and disappearance of network gradients are solved.
Example 3:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix respectively into the first dense connection network for feature fusion processing, taking the output of each dense connection network in the sequence as the input of the next, and repeating the feature fusion processing until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long short-term memory network respectively, and encoding the first word vector matrix and the second word vector matrix through the bidirectional long short-term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
and inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix.
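As an illustrative sketch of the encoding and splicing steps above (the single-layer BiLSTM, the hidden size, and the helper name encode_and_splice are assumptions for illustration):

    import torch
    import torch.nn as nn

    EMBED_DIM, HIDDEN = 300, 150  # assumed sizes
    bilstm = nn.LSTM(EMBED_DIM, HIDDEN, bidirectional=True, batch_first=True)

    def encode_and_splice(word_matrix: torch.Tensor) -> torch.Tensor:
        """word_matrix: (batch, seq_len, EMBED_DIM). Encodes context semantic
        features with the BiLSTM, then splices them onto the word vectors."""
        context, _ = bilstm(word_matrix)          # (batch, seq_len, 2*HIDDEN)
        return torch.cat([word_matrix, context], dim=-1)  # splicing vector matrix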
In particular, the alignment network originates from one of the earliest attention-based models for sentence-pair modeling, which achieved state-of-the-art results on the SNLI dataset while using an order of magnitude fewer parameters than other models and without relying on word-order information.
In the embodiment, the first word vector matrix and the second word vector matrix are respectively input into the first dense connection network for feature fusion processing, so that information interaction between sentences is more accurate and richer, the problems of feature loss, insufficient interaction between sentences and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
Example 4:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix respectively into the first dense connection network for feature fusion processing, taking the output of each dense connection network in the sequence as the input of the next, and repeating the feature fusion processing until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long short-term memory network respectively, and encoding the first word vector matrix and the second word vector matrix through the bidirectional long short-term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
e_ij = F(q_i)^T F(p_j),

where F(·) is a single-layer feedforward network, q_i is the ith first splicing vector, p_j is the jth second splicing vector, T denotes transposition, and e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector;

calculating a plurality of first semantic matching vectors from the plurality of similarity vectors and the plurality of second splicing vectors through a second formula, the second formula being:

q'_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) p_j,

where e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, p_j is the jth second splicing vector, q'_i is the ith first semantic matching vector, and l_p is the length of the second splicing vector matrix;

calculating a plurality of second semantic matching vectors from the plurality of similarity vectors and the plurality of first splicing vectors through a third formula, the third formula being:

p'_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) q_i,

where e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, q_i is the ith first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the jth second semantic matching vector;
and obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors.
It should be appreciated that the attention mechanism is employed to exchange information between the two sentences, i.e., to soft-align them.
Specifically, let the lengths of the first splicing vector matrix q and the second splicing vector matrix p be l_q and l_p respectively; the two splicing vector matrices can then be written as q = (q_1, q_2, …, q_{l_q}) and p = (p_1, p_2, …, p_{l_p}). The similarity vector e_ij between q_i and p_j is computed as the dot product of the projected vectors:

e_ij = F(q_i)^T F(p_j).

The semantic matching vector of each splicing vector is then computed through soft alignment of the sentence elements:

q'_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) p_j,

p'_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) q_i.
in the above embodiment, the alignment network is used to align the first and second spliced vector matrices to obtain the first and second semantic matching vector matrices, so that information interaction between sentences is more accurate and rich, problems of feature loss, insufficient interaction between sentences and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
Example 5:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix respectively into the first dense connection network for feature fusion processing, taking the output of each dense connection network in the sequence as the input of the next, and repeating the feature fusion processing until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long short-term memory network respectively, and encoding the first word vector matrix and the second word vector matrix through the bidirectional long short-term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
e_ij = F(q_i)^T F(p_j),

where F(·) is a single-layer feedforward network, q_i is the ith first splicing vector, p_j is the jth second splicing vector, T denotes transposition, and e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector;

calculating a plurality of first semantic matching vectors from the plurality of similarity vectors and the plurality of second splicing vectors through a second formula, the second formula being:

q'_i = Σ_{j=1}^{l_p} ( exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik) ) p_j,

where e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, p_j is the jth second splicing vector, q'_i is the ith first semantic matching vector, and l_p is the length of the second splicing vector matrix;

calculating a plurality of second semantic matching vectors from the plurality of similarity vectors and the plurality of first splicing vectors through a third formula, the third formula being:

p'_j = Σ_{i=1}^{l_q} ( exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj) ) q_i,

where e_ij is the similarity vector of the ith first splicing vector and the jth second splicing vector, q_i is the ith first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the jth second semantic matching vector;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
α_i = G([q_i^(1); q_i^(2); q_i^(3); q_i^(4); q_i^(5)]),

where

q_i^(1) = G_1([q_i; q'_i]),
q_i^(2) = G_2([q_i; q_i - q'_i]),
q_i^(3) = G_3([q_i; q_i ∘ q'_i]),
q_i^(4) = G_4([q_i; cos(q_i, q'_i)]),
q_i^(5) = G_5([q_i; g_i ∘ q'_i]),

where g_i = sigmoid(q'_i),

where q'_i is the ith first semantic matching vector, q_i is the ith first splicing vector, G, G_1, G_2, G_3, G_4 and G_5 are all single-layer feedforward networks, sigmoid is the activation function, g_i ∈ (0,1)^d is a selection gate, ∘ denotes element-wise multiplication, q_i^(1) through q_i^(5) are the ith first initial fusion vectors, α_i is the ith first fusion vector, and [;] is the vector splicing operation;
and obtaining a first fusion vector matrix according to the plurality of first fusion vectors.
Specifically, the fusion layer compares the local and aligned representations from multiple angles. The fusion calculation formulas are:

g_i = sigmoid(q'_i),
q_i^(1) = G_1([q_i; q'_i]),
q_i^(2) = G_2([q_i; q_i - q'_i]),
q_i^(3) = G_3([q_i; q_i ∘ q'_i]),
q_i^(4) = G_4([q_i; cos(q_i, q'_i)]),
q_i^(5) = G_5([q_i; g_i ∘ q'_i]),
α_i = G([q_i^(1); q_i^(2); q_i^(3); q_i^(4); q_i^(5)]),

where G_1, G_2, G_3, G_4, G_5 and G are single-layer feedforward networks with independent parameters, and ∘ denotes element-wise multiplication. Subtraction highlights the difference between the sentence pair, while element-wise multiplication and the cosine calculation compare the similarity of the sentence pair. sigmoid is the activation function, and g_i ∈ (0,1)^d serves as a selection gate that highlights the key features of the sentence.
In the embodiment, the plurality of first fusion vectors are obtained by respectively performing fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors according to the fourth formula, so that information interaction between sentences is more accurate and rich, the problems of feature loss, insufficient interaction between sentences and network gradient disappearance are solved, and semantic information of more sentence pairs can be captured.
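Under the reconstruction above, the multi-angle fusion might be sketched as follows; reading the five comparison functions as concatenation, difference, element-wise product, cosine similarity and a sigmoid gate is an assumption drawn from the description, not the patent's exact formulas, and the dimension D is a placeholder:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    D = 200  # assumed dimension of splicing and matching vectors

    def ff(in_dim: int) -> nn.Module:        # single-layer feedforward network
        return nn.Sequential(nn.Linear(in_dim, D), nn.ReLU())

    G1, G2, G3, G5 = ff(2 * D), ff(2 * D), ff(2 * D), ff(2 * D)
    G4 = ff(D + 1)                           # [q_i; cos(q_i, q'_i)] has D+1 entries
    G_out = ff(5 * D)                        # G: combines the five views

    def fuse(q: torch.Tensor, q_matched: torch.Tensor) -> torch.Tensor:
        """q, q_matched: (l, D). Returns the fusion vectors alpha_i."""
        g = torch.sigmoid(q_matched)                         # gate g_i in (0,1)^D
        cos = F.cosine_similarity(q, q_matched, dim=-1).unsqueeze(-1)
        views = [G1(torch.cat([q, q_matched], -1)),          # direct comparison
                 G2(torch.cat([q, q - q_matched], -1)),      # difference
                 G3(torch.cat([q, q * q_matched], -1)),      # similarity (product)
                 G4(torch.cat([q, cos], -1)),                # similarity (cosine)
                 G5(torch.cat([q, g * q_matched], -1))]      # gated key features
        return G_out(torch.cat(views, -1))                   # alpha_i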
Example 6:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix respectively into the first dense connection network for feature fusion processing, taking the result output by each dense connection network in the sequence as the input of the next dense connection network for further feature fusion processing, until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
e_ij = F(q_i)^T F(p_j),
wherein F(·) is a single-layer feedforward network, q_i is the i-th first splicing vector, p_j is the j-th second splicing vector, T denotes transposition, and e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector;
calculating, by a second formula, the plurality of similarity vectors and the plurality of second splicing vectors to obtain a plurality of first semantic matching vectors for the first splicing vectors, wherein the second formula is as follows:
q'_i = Σ_{j=1}^{l_p} [exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik)] · p_j,
wherein e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector, p_j is the j-th second splicing vector, q'_i is the i-th first semantic matching vector, and l_p is the length of the second splicing vector matrix;
calculating, by a third formula, the plurality of similarity vectors and the plurality of first splicing vectors to obtain a plurality of second semantic matching vectors for the second splicing vectors, wherein the third formula is as follows:
p'_j = Σ_{i=1}^{l_q} [exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj)] · q_i,
wherein e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector, q_i is the i-th first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the j-th second semantic matching vector;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
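As an illustration of this alignment step, the following is a minimal sketch assuming the standard softmax-attention reading of the second and third formulas above; the shapes and the name align are illustrative, not taken from the original.

    import torch
    import torch.nn.functional as F

    def align(q: torch.Tensor, p: torch.Tensor, ffn: torch.nn.Module):
        """Soft alignment between the two spliced vector matrices.

        q: (l_q, d) first spliced vector matrix; p: (l_p, d) second one;
        ffn: the single-layer feedforward network F of the first equation.
        Returns the first and second semantic matching vector matrices.
        """
        e = ffn(q) @ ffn(p).T                  # e_ij = F(q_i)^T F(p_j), shape (l_q, l_p)
        q_matched = F.softmax(e, dim=1) @ p    # second formula: q'_i = sum_j softmax_j(e_ij) p_j
        p_matched = F.softmax(e, dim=0).T @ q  # third formula: p'_j = sum_i softmax_i(e_ij) q_i
        return q_matched, p_matched

Each row of e is normalized over the second sentence to re-express q_i in terms of p, and each column over the first sentence to re-express p_j in terms of q.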
the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
β_i = G([p̄_i^1; p̄_i^2; p̄_i^3; p̄_i^4; p̄_i^5]),
wherein:
p̄_i^1 = G'_1([p_i; p'_i]),
p̄_i^2 = G'_2([p_i; p_i − p'_i]),
p̄_i^3 = G'_3([p_i; p_i ∘ p'_i]),
p̄_i^4 = G'_4([p_i; cos(p_i, p'_i)]),
p̄_i^5 = G'_5([p_i; g'_i ∘ p'_i]),
wherein g'_i = sigmoid(p'_i),
wherein p'_i is the i-th second semantic matching vector, p_i is the i-th second splicing vector, G'_1, G'_2, G'_3, G'_4 and G'_5 are all single-layer feedforward networks, sigmoid is the activation function, g'_i ∈ (0,1)^d is the gate vector, ∘ denotes element-wise multiplication, p̄_i^1 to p̄_i^5 are all the i-th second initial fusion vectors, β_i is the i-th second fusion vector, and [;] is the vector splicing operation;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
In this embodiment, the plurality of second fusion vectors are obtained by performing fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors according to the fifth formula, which makes the information interaction between sentences more accurate and richer, alleviates the problems of feature loss, insufficient interaction between sentences and vanishing network gradients, and captures more semantic information of the sentence pair.
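Putting this embodiment together, the sketch below chains several dense connection networks, each one encoding with a BiLSTM, splicing, aligning and fusing, with the output of one block feeding the next. The shared BiLSTM, the module interfaces and the class names are assumptions for illustration; align_net and fusion_net stand in for networks like the sketches shown earlier.

    import torch
    import torch.nn as nn

    class DenseConnectionBlock(nn.Module):
        """One dense connection network: BiLSTM encoding, splicing, alignment, fusion."""

        def __init__(self, in_dim: int, hidden: int,
                     align_net: nn.Module, fusion_net: nn.Module):
            super().__init__()
            # Bidirectional LSTM yields the context semantic feature vector matrices.
            self.bilstm = nn.LSTM(in_dim, hidden, batch_first=True, bidirectional=True)
            self.align_net = align_net    # yields the semantic matching vector matrices
            self.fusion_net = fusion_net  # yields the fusion vector matrices

        def forward(self, a: torch.Tensor, b: torch.Tensor):
            ctx_a, _ = self.bilstm(a)                  # (batch, len, 2 * hidden)
            ctx_b, _ = self.bilstm(b)
            spliced_a = torch.cat([a, ctx_a], dim=-1)  # splice input with context features
            spliced_b = torch.cat([b, ctx_b], dim=-1)
            matched_a, matched_b = self.align_net(spliced_a, spliced_b)
            return (self.fusion_net(spliced_a, matched_a),
                    self.fusion_net(spliced_b, matched_b))

    def training_network(blocks, a, b):
        """Feeds each dense connection network's output into the next one."""
        for block in blocks:
            a, b = block(a, b)
        return a, b  # the first and second fusion vector matrices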
Example 7:
a semantic similarity matching method comprises the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the process of performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value comprises the following steps:
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix through a sixth expression to obtain a predicted value, wherein the sixth expression is as follows:
ŷ = H([v_1; v_2; v_1 + v_2; v_1 − v_2; |v_1 − v_2|; v_1 ∘ v_2]),
wherein H is a multi-layer feedforward neural network, v_1 is the first conversion vector matrix, v_2 is the second conversion vector matrix, ŷ is the predicted value, the +, −, |·| and ∘ operations are all performed element by element, and [;] is the vector splicing operation.
Specifically, the prediction layer takes as input the first conversion vector matrix v_1 and the second conversion vector matrix v_2 output by the pooling layer for the two sequences, and obtains the predicted value by:
ŷ = H([v_1; v_2; v_1 + v_2; v_1 − v_2; |v_1 − v_2|; v_1 ∘ v_2])
wherein H is a multi-layer feedforward neural network. In the classification task, ŷ ∈ R^C represents the unnormalized scores of all classes, where C is the number of classes. In the regression task, ŷ is the predicted scalar value, i.e. the predicted value.
In the paraphrase recognition task, the prediction layer is:
ŷ = H([v_1; v_2; |v_1 − v_2|; v_1 ∘ v_2])
A simplified version of the prediction layer network is also provided, the formula being:
ŷ = H([v_1; v_2])
In the prediction layer, the +, −, |·| and ∘ operations are performed element by element in order to infer the relationship between the two sentences.
In this embodiment, the predicted value is obtained by performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix according to the sixth formula, which alleviates the problems of feature loss, insufficient sentence interaction and vanishing network gradients, enriches the semantic features of the sentences, makes the information interaction between sentences more accurate and abundant, and captures more semantic information of the sentence pair.
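To round out the pipeline, this sketch covers the vector conversion and relevance prediction stages. Max-pooling over the sequence dimension is assumed for the pooling layer (the embodiment names it without specifying it), the combined features follow the sixth formula as reconstructed above, and PredictionLayer and the hidden size are illustrative.

    import torch
    import torch.nn as nn

    class PredictionLayer(nn.Module):
        """H: multi-layer feedforward network over the combined conversion vectors."""

        def __init__(self, dim: int, hidden: int, n_classes: int):
            super().__init__()
            self.h = nn.Sequential(
                nn.Linear(6 * dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, n_classes),  # unnormalized scores; n_classes=1 for regression
            )

        def forward(self, fused_a: torch.Tensor, fused_b: torch.Tensor) -> torch.Tensor:
            # Vector conversion: max-pool each fusion vector matrix over time
            # to obtain the conversion vectors v1 and v2.
            v1, _ = fused_a.max(dim=1)
            v2, _ = fused_b.max(dim=1)
            # Sixth formula: element-wise +, -, absolute difference and product,
            # spliced together before the feedforward network H.
            feats = torch.cat([v1, v2, v1 + v2, v1 - v2,
                               (v1 - v2).abs(), v1 * v2], dim=-1)
            return self.h(feats)  # the predicted value

The simplified variant described above would keep only [v_1; v_2], and the paraphrase-recognition variant only the symmetric terms.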
Fig. 2 is a block diagram of a semantic similarity matching apparatus according to an embodiment of the present invention.
Example 8:
as shown in fig. 2, a semantic similarity matching apparatus includes:
the word vectorization module is used for importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
the feature fusion processing module is used for constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the vector conversion module is used for respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
and the matching result obtaining module is used for carrying out relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result.
Example 9:
a semantic similarity matching apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor when executing the computer program implementing the semantic similarity matching method as described above. The device may be a computer or the like.
Example 10:
a computer-readable storage medium, storing a computer program which, when executed by a processor, implements the semantic similarity matching method as described above.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. A semantic similarity matching method is characterized by comprising the following steps:
importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value, and taking the predicted value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix respectively into the first dense connection network for feature fusion processing, taking the result output by each dense connection network in the sequence as the input of the next dense connection network for further feature fusion processing, until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network respectively for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix comprises the following steps:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
e_ij = F(q_i)^T F(p_j),
wherein F(·) is a single-layer feedforward network, q_i is the i-th first splicing vector, p_j is the j-th second splicing vector, T denotes transposition, and e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector;
calculating, by a second formula, the plurality of similarity vectors and the plurality of second splicing vectors to obtain a plurality of first semantic matching vectors for the first splicing vectors, wherein the second formula is as follows:
q'_i = Σ_{j=1}^{l_p} [exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik)] · p_j,
wherein e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector, p_j is the j-th second splicing vector, q'_i is the i-th first semantic matching vector, and l_p is the length of the second splicing vector matrix;
calculating, by a third formula, the plurality of similarity vectors and the plurality of first splicing vectors to obtain a plurality of second semantic matching vectors for the second splicing vectors, wherein the third formula is as follows:
p'_j = Σ_{i=1}^{l_q} [exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj)] · q_i,
wherein e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector, q_i is the i-th first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the j-th second semantic matching vector;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
α_i = G([q̄_i^1; q̄_i^2; q̄_i^3; q̄_i^4; q̄_i^5]),
wherein:
q̄_i^1 = G_1([q_i; q'_i]),
q̄_i^2 = G_2([q_i; q_i − q'_i]),
q̄_i^3 = G_3([q_i; q_i ∘ q'_i]),
q̄_i^4 = G_4([q_i; cos(q_i, q'_i)]),
q̄_i^5 = G_5([q_i; g_i ∘ q'_i]),
wherein g_i = sigmoid(q'_i),
wherein q'_i is the i-th first semantic matching vector, q_i is the i-th first splicing vector, G, G_1, G_2, G_3, G_4 and G_5 are all single-layer feedforward networks, sigmoid is the activation function, g_i ∈ (0,1)^d is the gate vector, ∘ denotes element-wise multiplication, q̄_i^1 to q̄_i^5 are all the i-th first initial fusion vectors, α_i is the i-th first fusion vector, and [;] is the vector splicing operation;
obtaining a first fusion vector matrix according to the plurality of first fusion vectors;
the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix comprises the following steps:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
β_i = G([p̄_i^1; p̄_i^2; p̄_i^3; p̄_i^4; p̄_i^5]),
wherein:
p̄_i^1 = G'_1([p_i; p'_i]),
p̄_i^2 = G'_2([p_i; p_i − p'_i]),
p̄_i^3 = G'_3([p_i; p_i ∘ p'_i]),
p̄_i^4 = G'_4([p_i; cos(p_i, p'_i)]),
p̄_i^5 = G'_5([p_i; g'_i ∘ p'_i]),
wherein g'_i = sigmoid(p'_i),
wherein p'_i is the i-th second semantic matching vector, p_i is the i-th second splicing vector, G'_1, G'_2, G'_3, G'_4 and G'_5 are all single-layer feedforward networks, sigmoid is the activation function, g'_i ∈ (0,1)^d is the gate vector, ∘ denotes element-wise multiplication, p̄_i^1 to p̄_i^5 are all the i-th second initial fusion vectors, β_i is the i-th second fusion vector, and [;] is the vector splicing operation;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
2. The semantic similarity matching method according to claim 1, wherein the predicting the correlation between the first conversion vector matrix and the second conversion vector matrix to obtain a predicted value comprises:
performing relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix through a sixth expression to obtain a predicted value, wherein the sixth expression is as follows:
ŷ = H([v_1; v_2; v_1 + v_2; v_1 − v_2; |v_1 − v_2|; v_1 ∘ v_2]),
wherein H is a multi-layer feedforward neural network, v_1 is the first conversion vector matrix, v_2 is the second conversion vector matrix, ŷ is the predicted value, the +, −, |·| and ∘ operations are all performed element by element, and [;] is the vector splicing operation.
3. A semantic similarity matching apparatus, comprising:
the word vectorization module is used for importing a first sample to be analyzed and a second sample to be analyzed, and respectively carrying out word vectorization processing on the first sample to be analyzed and the second sample to be analyzed to obtain a first word vector matrix corresponding to the first sample to be analyzed and a second word vector matrix corresponding to the second sample to be analyzed;
the feature fusion processing module is used for constructing a training network, and respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the vector conversion module is used for respectively carrying out vector conversion on the first fusion vector matrix and the second fusion vector matrix to obtain a first conversion vector matrix corresponding to the first fusion vector matrix and a second conversion vector matrix corresponding to the second fusion vector matrix;
a matching result obtaining module, configured to perform relevance prediction processing on the first conversion vector matrix and the second conversion vector matrix to obtain a prediction value, and use the prediction value as a matching result;
the training network comprises a plurality of densely connected networks which are connected in sequence;
in the feature fusion processing module, the process of respectively performing feature fusion processing on the first word vector matrix and the second word vector matrix through the training network to obtain a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix includes:
inputting the first word vector matrix and the second word vector matrix respectively into the first dense connection network for feature fusion processing, taking the result output by each dense connection network in the sequence as the input of the next dense connection network for further feature fusion processing, until the last dense connection network in the sequence outputs a first fusion vector matrix corresponding to the first word vector matrix and a second fusion vector matrix corresponding to the second word vector matrix;
the dense connection network comprises a bidirectional long-short term memory network, an alignment network and a fusion network;
in the feature fusion processing module, the process of inputting the first word vector matrix and the second word vector matrix into the first dense connection network for feature fusion processing includes:
inputting the first word vector matrix and the second word vector matrix into the bidirectional long and short term memory network respectively, and coding the first word vector matrix and the second word vector matrix through the bidirectional long and short term memory network respectively to obtain a first context semantic feature vector matrix corresponding to the first word vector matrix and a second context semantic feature vector matrix corresponding to the second word vector matrix;
splicing the first word vector matrix and the first context semantic feature vector matrix to obtain a first spliced vector matrix;
splicing the second word vector matrix and the second context semantic feature vector matrix to obtain a second spliced vector matrix;
respectively inputting the first splicing vector matrix and the second splicing vector matrix into the alignment network, and respectively aligning the first splicing vector matrix and the second splicing vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first splicing vector matrix and a second semantic matching vector matrix corresponding to the second splicing vector matrix;
inputting the first splicing vector matrix and the first semantic matching vector matrix into the fusion network, and performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix;
inputting the second splicing vector matrix and the second semantic matching vector matrix into the fusion network, and performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix;
the first splicing vector matrix comprises a plurality of first splicing vectors, and the second splicing vector matrix comprises a plurality of second splicing vectors,
in the feature fusion processing module, the process of respectively aligning the first spliced vector matrix and the second spliced vector matrix through the alignment network to obtain a first semantic matching vector matrix corresponding to the first spliced vector matrix and a second semantic matching vector matrix corresponding to the second spliced vector matrix includes:
respectively calculating the similarity of the plurality of first splicing vectors and the plurality of second splicing vectors through a first equation to obtain a plurality of similarity vectors, wherein the first equation is as follows:
e_ij = F(q_i)^T F(p_j),
wherein F(·) is a single-layer feedforward network, q_i is the i-th first splicing vector, p_j is the j-th second splicing vector, T denotes transposition, and e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector;
calculating, by a second formula, the plurality of similarity vectors and the plurality of second splicing vectors to obtain a plurality of first semantic matching vectors for the first splicing vectors, wherein the second formula is as follows:
q'_i = Σ_{j=1}^{l_p} [exp(e_ij) / Σ_{k=1}^{l_p} exp(e_ik)] · p_j,
wherein e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector, p_j is the j-th second splicing vector, q'_i is the i-th first semantic matching vector, and l_p is the length of the second splicing vector matrix;
calculating, by a third formula, the plurality of similarity vectors and the plurality of first splicing vectors to obtain a plurality of second semantic matching vectors for the second splicing vectors, wherein the third formula is as follows:
p'_j = Σ_{i=1}^{l_q} [exp(e_ij) / Σ_{k=1}^{l_q} exp(e_kj)] · q_i,
wherein e_ij is the similarity vector of the i-th first splicing vector and the j-th second splicing vector, q_i is the i-th first splicing vector, l_q is the length of the first splicing vector matrix, and p'_j is the j-th second semantic matching vector;
obtaining a first semantic matching vector matrix according to the plurality of first semantic matching vectors, and obtaining a second semantic matching vector matrix according to the plurality of second semantic matching vectors;
in the feature fusion processing module, the process of performing fusion calculation on the first splicing vector matrix and the first semantic matching vector matrix through the fusion network to obtain a first fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of first splicing vectors and the plurality of first semantic matching vectors by a fourth formula to obtain a plurality of first fusion vectors, wherein the fourth formula is as follows:
α_i = G([q̄_i^1; q̄_i^2; q̄_i^3; q̄_i^4; q̄_i^5]),
wherein:
q̄_i^1 = G_1([q_i; q'_i]),
q̄_i^2 = G_2([q_i; q_i − q'_i]),
q̄_i^3 = G_3([q_i; q_i ∘ q'_i]),
q̄_i^4 = G_4([q_i; cos(q_i, q'_i)]),
q̄_i^5 = G_5([q_i; g_i ∘ q'_i]),
wherein g_i = sigmoid(q'_i),
wherein q'_i is the i-th first semantic matching vector, q_i is the i-th first splicing vector, G, G_1, G_2, G_3, G_4 and G_5 are all single-layer feedforward networks, sigmoid is the activation function, g_i ∈ (0,1)^d is the gate vector, ∘ denotes element-wise multiplication, q̄_i^1 to q̄_i^5 are all the i-th first initial fusion vectors, α_i is the i-th first fusion vector, and [;] is the vector splicing operation;
obtaining a first fusion vector matrix according to the plurality of first fusion vectors;
in the feature fusion processing module, the process of performing fusion calculation on the second splicing vector matrix and the second semantic matching vector matrix through the fusion network to obtain a second fusion vector matrix includes:
respectively carrying out fusion calculation on the plurality of second splicing vectors and the plurality of second semantic matching vectors through a fifth formula to obtain a plurality of second fusion vectors, wherein the fifth formula is as follows:
β_i = G([p̄_i^1; p̄_i^2; p̄_i^3; p̄_i^4; p̄_i^5]),
wherein:
p̄_i^1 = G'_1([p_i; p'_i]),
p̄_i^2 = G'_2([p_i; p_i − p'_i]),
p̄_i^3 = G'_3([p_i; p_i ∘ p'_i]),
p̄_i^4 = G'_4([p_i; cos(p_i, p'_i)]),
p̄_i^5 = G'_5([p_i; g'_i ∘ p'_i]),
wherein g'_i = sigmoid(p'_i),
wherein p'_i is the i-th second semantic matching vector, p_i is the i-th second splicing vector, G'_1, G'_2, G'_3, G'_4 and G'_5 are all single-layer feedforward networks, sigmoid is the activation function, g'_i ∈ (0,1)^d is the gate vector, ∘ denotes element-wise multiplication, p̄_i^1 to p̄_i^5 are all the i-th second initial fusion vectors, β_i is the i-th second fusion vector, and [;] is the vector splicing operation;
and obtaining a second fusion vector matrix according to the plurality of second fusion vectors.
4. A semantic similarity matching apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that when the computer program is executed by the processor, the semantic similarity matching method according to any one of claims 1 to 2 is implemented.
5. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the semantic similarity matching method according to any one of claims 1 to 2.
CN202011579327.0A 2020-12-28 2020-12-28 Semantic similarity matching method and device and storage medium Active CN112560502B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011579327.0A CN112560502B (en) 2020-12-28 2020-12-28 Semantic similarity matching method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011579327.0A CN112560502B (en) 2020-12-28 2020-12-28 Semantic similarity matching method and device and storage medium

Publications (2)

Publication Number Publication Date
CN112560502A CN112560502A (en) 2021-03-26
CN112560502B true CN112560502B (en) 2022-05-13

Family

ID=75033914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011579327.0A Active CN112560502B (en) 2020-12-28 2020-12-28 Semantic similarity matching method and device and storage medium

Country Status (1)

Country Link
CN (1) CN112560502B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158624B (en) * 2021-04-09 2023-12-08 中国人民解放军国防科技大学 Method and system for fine tuning pre-training language model by fusing language information in event extraction
CN113656066B (en) * 2021-08-16 2022-08-05 南京航空航天大学 Clone code detection method based on feature alignment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110348014A (en) * 2019-07-10 2019-10-18 电子科技大学 A kind of semantic similarity calculation method based on deep learning
CN110765755A (en) * 2019-10-28 2020-02-07 桂林电子科技大学 Semantic similarity feature extraction method based on double selection gates
CN110826338A (en) * 2019-10-28 2020-02-21 桂林电子科技大学 Fine-grained semantic similarity recognition method for single-choice gate and inter-class measurement
CN110930219A (en) * 2019-11-14 2020-03-27 电子科技大学 Personalized merchant recommendation method based on multi-feature fusion
CN111241295A (en) * 2020-01-03 2020-06-05 浙江大学 Knowledge map relation data extraction method based on semantic syntax interactive network
CN111310438A (en) * 2020-02-20 2020-06-19 齐鲁工业大学 Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10388272B1 (en) * 2018-12-04 2019-08-20 Sorenson Ip Holdings, Llc Training speech recognition systems using word sequences

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110348014A (en) * 2019-07-10 2019-10-18 电子科技大学 A kind of semantic similarity calculation method based on deep learning
CN110765755A (en) * 2019-10-28 2020-02-07 桂林电子科技大学 Semantic similarity feature extraction method based on double selection gates
CN110826338A (en) * 2019-10-28 2020-02-21 桂林电子科技大学 Fine-grained semantic similarity recognition method for single-choice gate and inter-class measurement
CN110930219A (en) * 2019-11-14 2020-03-27 电子科技大学 Personalized merchant recommendation method based on multi-feature fusion
CN111241295A (en) * 2020-01-03 2020-06-05 浙江大学 Knowledge map relation data extraction method based on semantic syntax interactive network
CN111310438A (en) * 2020-02-20 2020-06-19 齐鲁工业大学 Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information; Seonhoon Kim et al.; https://arxiv.org/pdf/1805.11360.pdf; 20181102; 1-11 *
Simple and Effective Text Matching with Richer Alignment Features; Runqi Yang et al.; Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics; 20190802; 4699-4709 *
Text matching model based on densely connected network and multi-dimensional feature fusion; Chen Yuelin et al.; Journal of Zhejiang University (Engineering Science); 20211231; 2352-2358 *

Also Published As

Publication number Publication date
CN112560502A (en) 2021-03-26

Similar Documents

Publication Publication Date Title
CN109947912B (en) Model method based on intra-paragraph reasoning and joint question answer matching
CN108334487B (en) Missing semantic information completion method and device, computer equipment and storage medium
Rastogi et al. Scalable multi-domain dialogue state tracking
CN111368565B (en) Text translation method, text translation device, storage medium and computer equipment
CN108052512B (en) Image description generation method based on depth attention mechanism
CN109783817B (en) Text semantic similarity calculation model based on deep reinforcement learning
CN109661664B (en) Information processing method and related device
CN111831789B (en) Question-answering text matching method based on multi-layer semantic feature extraction structure
CN110781306B (en) English text aspect layer emotion classification method and system
Zhang et al. Exploring question understanding and adaptation in neural-network-based question answering
CN112131366A (en) Method, device and storage medium for training text classification model and text classification
CN111581973A (en) Entity disambiguation method and system
CN110457718B (en) Text generation method and device, computer equipment and storage medium
CN109740158B (en) Text semantic parsing method and device
CN112560502B (en) Semantic similarity matching method and device and storage medium
CN110232122A (en) A kind of Chinese Question Classification method based on text error correction and neural network
CN111191002A (en) Neural code searching method and device based on hierarchical embedding
CN109582970B (en) Semantic measurement method, semantic measurement device, semantic measurement equipment and readable storage medium
CN112183085A (en) Machine reading understanding method and device, electronic equipment and computer storage medium
WO2019220113A1 (en) Device and method for natural language processing
WO2022134793A1 (en) Method and apparatus for extracting semantic information in video frame, and computer device
CN116204674B (en) Image description method based on visual concept word association structural modeling
CN113743119A (en) Chinese named entity recognition module, method and device and electronic equipment
JP2017010249A (en) Parameter learning device, sentence similarity calculation device, method, and program
CN114417823A (en) Aspect level emotion analysis method and device based on syntax and graph convolution network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant