CN109948163A - Natural language semantic matching method based on dynamic sequence reading - Google Patents

Natural language semantic matching method based on dynamic sequence reading

Info

Publication number: CN109948163A (application CN201910228242.9A; granted as CN109948163B)
Authority: CN (China)
Prior art keywords: natural language, vector, word, language sentences, sentence
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 陈恩红 (Chen Enhong), 刘淇 (Liu Qi), 张琨 (Zhang Kun), 吕广奕 (Lv Guangyi), 吴乐 (Wu Le)
Current and original assignee: University of Science and Technology of China (USTC)
Application filed by University of Science and Technology of China (USTC); priority to CN201910228242.9A
Publication of CN109948163A; application granted; publication of CN109948163B

Abstract

The invention discloses a natural language semantic matching method based on dynamic sequence reading, comprising: performing semantic modeling on each word of a natural language sentence pair to obtain a word-level semantic representation vector for each natural language sentence; from the word-level semantic representation vectors, obtaining each sentence's semantic representation vector and the mutually dependent inter-word hidden-layer representation vectors through a stacked neural network; using the sentence semantic representation vectors and the mutually dependent inter-word hidden-layer representation vectors to perform dynamic understanding of sentence semantics, obtaining a dynamic-understanding representation vector for each sentence; integrating, respectively, the sentence semantic representation vectors and the dynamic-understanding representation vectors of the sentence pair, and classifying the semantic relation of the pair according to the integrated results. By dynamically reading the word sequence of each sentence, the method achieves an accurate understanding and representation of sentence semantics, and in turn an accurate judgment of natural language semantic matching.

Description

Natural language semantic matching method based on dynamic sequence reading
Technical field
The present invention relates to the technical fields of deep learning and natural language understanding, and in particular to a natural language semantic matching method based on dynamic sequence reading.
Background technique
Sentence semantic matching is a basic yet key research topic in the field of natural language processing; the main problem it solves is judging the semantic relation between two sentences. In natural language inference (NLI), sentence semantic matching is mainly used to judge whether the semantics of a hypothesis sentence can be inferred from a premise sentence. In paraphrase identification (PI), it is mainly used to judge whether two sentences express the same meaning. The first problem such tasks must solve is therefore the semantic representation of natural language sentences. Sentence semantic representation is a fundamental but extremely important research topic in natural language processing and, more broadly, artificial intelligence: whether for basic information retrieval and semantic extraction or for complex question answering and dialogue systems, a comprehensive and accurate representation of the semantics of the input sentence is required before a machine can understand the complexity of human language. Researchers have proposed a variety of semantic representation learning methods, among which attention mechanisms, which imitate human attention behavior, have attracted more and more interest. Attention can help select the words in a sentence that matter most to its semantic representation, and can model the dependencies between words without being limited by sentence length, providing important technical support for sentence semantic representation.
Going further, researchers have proposed multi-head attention, which models sentence semantics from different angles by considering different situations, thereby achieving a more comprehensive and more accurate understanding and representation of sentence semantics. Studying natural language semantic representation with attention mechanisms has therefore become a particularly important research direction in the natural language field.
At present, research on natural language sentence semantic representation using attention mechanisms mainly covers the following:
In biological cognitive science, attention helps people focus on the content most relevant to a target. Methods that apply attention to sentence semantic representation therefore mainly use a variety of neural network structures with biologically inspired attention, such as inner-attention, multi-head attention, co-attention, and directional attention, to model the semantic dependencies and matching relations between words within a sentence and between sentences; different neural network structures, such as convolutional neural networks (CNN) and recurrent neural networks (RNN), then integrate this information to obtain the final sentence semantic representation, which is applied to different specific tasks.
The attention-based sentence representation methods above by default process the word sequence of the entire sentence in left-to-right or right-to-left reading order. Yet cognitive science research shows that in real life, humans process sequences with attention very differently. Using eye-tracking instruments, researchers have found that people skip some words while reading, and that humans choose different focus points and reading orders according to what they see and what they want to obtain. Further research shows that in deep reading, humans attend to only about 1.5 words at a time, and pay attention to at most about 7 distinct targets at once. All of this demonstrates that human attention focuses on only a small portion of the content at any time, and achieves accurate understanding of meaning through repeated attention to important content.
Summary of the invention
The object of the present invention is to provide a natural language semantic matching method based on dynamic sequence reading, which achieves accurate understanding and representation of sentence semantics through dynamic reading of the sentence's word sequence, and in turn achieves accurate judgment of natural language semantic matching.
The purpose of the present invention is achieved through the following technical solutions:
A natural language semantic matching method based on dynamic sequence reading, comprising:
performing semantic modeling on each word of a natural language sentence pair to obtain a word-level semantic representation vector for each natural language sentence;
from the word-level semantic representation vectors, obtaining each sentence's semantic representation vector and the mutually dependent inter-word hidden-layer representation vectors through a stacked neural network;
using the sentence semantic representation vectors and the mutually dependent inter-word hidden-layer representation vectors to perform dynamic understanding of sentence semantics, obtaining a dynamic-understanding representation vector for each sentence;
integrating, respectively, the sentence semantic representation vectors and the dynamic-understanding representation vectors of the sentence pair, and classifying the semantic relation of the pair according to the integrated results.
As can be seen from the technical solution provided above, the method draws on the human attention mechanism and makes full use of dynamic reading of the sentence's word sequence, enabling accurate selection of and repeated attention to the important words in a sentence, thereby modeling and characterizing sentence semantics efficiently and ultimately judging sentence semantic matching accurately.
Detailed description of the invention
In order to explain the technical solution of the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a natural language semantic matching method based on dynamic sequence reading provided by an embodiment of the present invention.
Specific embodiment
The technical solution in the embodiments of the present invention is described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a natural language semantic matching method based on dynamic sequence reading. As shown in Fig. 1, it mainly comprises:
Step 11: perform semantic modeling on each word of the natural language sentence pair to obtain a word-level semantic representation vector for each natural language sentence.
The preferred embodiment of this step is as follows:
1) The natural language sentence pair is expressed in a unified mathematical form: the pair comprises two natural language sentences; one is denoted s_a = {a_1, ..., a_{l_a}}, a text composed of l_a words, and the other is denoted s_b = {b_1, ..., b_{l_b}}, a text composed of l_b words, where a_i and b_j respectively denote the i-th word of s_a and the j-th word of s_b.
2) All words in s_a and s_b constitute a dictionary V, whose size is denoted l_v. Each word in s_a and s_b is represented by a one-hot vector whose length is the size of V; in each word's one-hot vector, only the position of the word's index in V is 1 and all other positions are 0. On this basis, the feature representation of each word, i.e. its pre-trained word semantic representation, is obtained with the pre-trained word-vector matrix E:
e_i^a = E · onehot(a_i),  e_j^b = E · onehot(b_j)
where e_i^a and e_j^b correspond to the pre-trained semantic representations of the i-th word of s_a and the j-th word of s_b;
3) Assuming the natural language sentence pair is English text, all English letters form a character dictionary V_c of size 26. Each letter of a word is represented by a one-hot vector whose length is the size of V_c; in each letter's one-hot vector, only the position of the letter's index in V_c is 1 and all other positions are 0. On this basis, a one-dimensional convolution is applied to the letter sequence of each word; specifically, convolution kernels of different widths (unigram, bigram, trigram) can be used, followed by a max-pooling operation, finally yielding the character-level semantic representation of each word:
c_i^a = Maxpooling(Conv1D(E_c, a_i)),  c_j^b = Maxpooling(Conv1D(E_c, b_j))
where E_c denotes the character embedding matrix to be trained, Conv1D denotes the one-dimensional convolution operation, Maxpooling denotes the max-pooling operation, and the one-hot inputs correspond to the i_c-th letter of the i-th word of s_a and the j_c-th letter of the j-th word of s_b;
4) In order to represent each word more accurately and completely, the pre-trained feature representation of each word is concatenated with its character-level semantic representation, and a two-layer highway network then integrates this information, finally yielding the semantic representation vector of each word in the natural language sentences:
a_i = Highway([e_i^a; c_i^a]),  b_j = Highway([e_j^b; c_j^b])
where Highway(·) denotes the highway network structure, and a_i, b_j denote the semantic representation vectors of the i-th word of s_a and the j-th word of s_b.
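The word-representation pipeline of this step (pre-trained embedding lookup, character-level one-dimensional convolution with max-pooling, concatenation, two highway layers) can be sketched roughly as follows. This is a minimal numpy illustration, not the patented implementation: the toy dimensions, random weights, and single trigram kernel are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def char_cnn(char_onehots, Ec, kernel, width=3):
    """1-D convolution over a word's letter sequence, then max-pooling.
    char_onehots: (n_letters, 26); Ec: (26, d_c); kernel: (width*d_c, d_out)."""
    embedded = char_onehots @ Ec                      # (n_letters, d_c)
    n, d_c = embedded.shape
    if n < width:                                     # pad very short words
        embedded = np.vstack([embedded, np.zeros((width - n, d_c))])
        n = width
    windows = np.stack([embedded[i:i + width].ravel()
                        for i in range(n - width + 1)])
    feats = np.maximum(windows @ kernel, 0.0)         # ReLU conv features
    return feats.max(axis=0)                          # max over positions

def highway(x, Wh, Wt):
    """One highway layer: gated mix of a transform of x and x itself."""
    t = sigmoid(x @ Wt)                               # transform gate
    return t * np.tanh(x @ Wh) + (1.0 - t) * x

# toy dimensions (assumed): word-emb 8, char-emb 4, char-CNN out 8
d_w, d_c, d_out = 8, 4, 8
E_word = rng.normal(size=(100, d_w))                  # pre-trained matrix E (toy)
E_char = rng.normal(size=(26, d_c))                   # trainable char matrix Ec
K = rng.normal(size=(3 * d_c, d_out))                 # trigram kernel

letters = np.eye(26)[[2, 0, 19]]                      # one-hot letters of "cat"
e_i = E_word[7]                                       # pre-trained word vector
c_i = char_cnn(letters, E_char, K)                    # character-level vector
x = np.concatenate([e_i, c_i])                        # splice the two views

d = d_w + d_out
Wh, Wt = rng.normal(size=(d, d)), rng.normal(size=(d, d))
a_i = highway(highway(x, Wh, Wt), Wh, Wt)             # two highway layers
print(a_i.shape)
```

In a trained model E_char, K, Wh, and Wt would be learned parameters; here they only demonstrate the data flow.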
Step 12: from the word-level semantic representation vectors, obtain each sentence's semantic representation vector and the mutually dependent inter-word hidden-layer representation vectors through a stacked neural network.
The preferred embodiment of this step is as follows:
1) Humans deepen their understanding of a sentence's semantics through repeated reading. Therefore, in order to model the information contained in a sentence more comprehensively and repeatedly, a stacked recurrent neural network (Stack-RNN) is used to model the entire natural language sentence and obtain the hidden state sequence of each word in each sentence. A gated recurrent unit (GRU) is used as the basic unit; for the input x_t at time t, the hidden state h_t of the GRU is updated as follows:
z = σ(x_t U_z + h_{t-1} W_z)
r = σ(x_t U_r + h_{t-1} W_r)
c_m = tanh(x_t U_h + (r ⊙ h_{t-1}) W_h)
h_t = (1 − z) ⊙ c_m + z ⊙ h_{t-1}
where z, r, c_m are the update gate, reset gate, and memory cell of the GRU; U_z and W_z are the parameter matrices of the update gate, U_r and W_r those of the reset gate, and U_h and W_h those of the memory cell; ⊙ denotes element-wise multiplication; x_t denotes the semantic representation vector of the t-th word of s_a or s_b; σ denotes the sigmoid activation.
On this basis, repeated reading and understanding of sentence semantics is realized by stacking multiple GRU layers (stack-GRU), so that sentence semantics are understood more completely. But as the network deepens, the model cannot retain all of the information acquired so far, and it also faces the gradient vanishing or exploding problem.
To avoid these problems, the embodiment of the present invention concatenates the input and hidden-layer output of each GRU layer as the input to the next layer:
x_t^l = [h_t^{l-1}, x_t^{l-1}],  h_t^l = GRU^l(x_t^l, h_{t-1}^l)
where GRU^l denotes the l-th GRU layer, h_t^{l-1} denotes the t-th hidden state of layer (l−1), x_t^{l-1} denotes the t-th input of layer (l−1), and [,] denotes concatenation. This operation guarantees that the model can retain all information while, to a certain extent, avoiding gradient vanishing or explosion.
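The stacked GRU with concatenated layer inputs described above can be sketched as follows; the gate equations follow the ones given for the GRU, while the toy dimensions and random weights are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, p):
    """One GRU update, matching the gate equations in the text."""
    z = sigmoid(x @ p["Uz"] + h @ p["Wz"])            # update gate
    r = sigmoid(x @ p["Ur"] + h @ p["Wr"])            # reset gate
    c = np.tanh(x @ p["Uh"] + (r * h) @ p["Wh"])      # memory cell
    return (1.0 - z) * c + z * h

def make_params(d_in, d_h):
    return {k: rng.normal(scale=0.1,
                          size=(d_in if k[0] == "U" else d_h, d_h))
            for k in ["Uz", "Wz", "Ur", "Wr", "Uh", "Wh"]}

def stack_gru(xs, n_layers=3, d_h=8):
    """Each layer's next-layer input is [hidden state; this layer's input]."""
    layer_in = xs                                      # list of word vectors
    outs = []
    for l in range(n_layers):
        p = make_params(layer_in[0].shape[0], d_h)
        h = np.zeros(d_h)
        outs, nxt = [], []
        for x in layer_in:
            h = gru_step(x, h, p)
            outs.append(h)
            nxt.append(np.concatenate([h, x]))        # [h_t^l, x_t^l]
        layer_in = nxt                                 # feeds layer l+1
    return outs                                        # top-layer hidden states

words = [rng.normal(size=6) for _ in range(5)]         # 5 toy word vectors
u = stack_gru(words)
print(len(u), u[0].shape)
```

Note how the input width grows with depth (6, then 14, then 22 here), which is exactly the retained-information effect the concatenation is meant to provide.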
Afterwards, this stack-GRU variant is used to read and understand the natural language sentence pair repeatedly, encoding the semantic representation of each word in each sentence more comprehensively and yielding the mutually dependent inter-word hidden-layer representations of each sentence, with formulas as follows:
u_{i'}^a = stack-GRU({a_1, ..., a_{i'}}),  u_{j'}^b = stack-GRU({b_1, ..., b_{j'}})
where u_{i'}^a and u_{j'}^b denote the sentence-level semantic representations of the i'-th word of s_a and the j'-th word of s_b, and {a_1, ..., a_{i'}} and {b_1, ..., b_{j'}} denote the sets of phrase-level semantic representations from the 1st to the i'-th (respectively j'-th) position;
2) What is obtained above are the mutually dependent inter-word hidden-layer representations within each sentence; however, each word's semantic representation influences the semantic representation of the entire sentence to a different degree, and attention can help the model select the additional information most relevant to the semantic representation.
In order to guarantee accurate understanding and representation of sentence semantics, the embodiment of the present invention uses self-attention to obtain the influence weight of each word's semantic representation on the final sentence representation, and uses these weights to compute a weighted sum of the hidden states of all words, obtaining the sentence semantic representation:
α_a = softmax(ω^T tanh(W U_a + b))
h_a = Σ_{k'} α_{a,k'} u_{k'}^a
where ω and W are the weights and b the bias in the self-attention computation, all parameters learned during model training; α_a denotes the weight distribution obtained after applying self-attention to s_a, whose i'-th and k'-th elements are α_{a,i'} and α_{a,k'}; U_a is the matrix of hidden states u_{i'}^a; and h_a denotes the semantic representation vector of s_a.
Similarly, applying the same operations to s_b yields the semantic representation vector h_b of s_b.
Through this step, the embodiment of the present invention has obtained the sentence semantic representations produced by repeated reading and understanding, as well as the hidden states of each word that take the mutual dependencies between words into account.
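The self-attention pooling of this step, scores ω^T tanh(W·u + b) followed by a softmax over word positions and a weighted sum of hidden states, can be illustrated roughly like this; the dimensions and random weights are toy assumptions, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def self_attention_pool(U, W, w, b):
    """alpha = softmax(w^T tanh(W U^T + b)); pooled vector = alpha-weighted
    sum of the rows of U.  U: (n_words, d); W: (d_a, d); w, b: (d_a,)."""
    scores = w @ np.tanh(W @ U.T + b[:, None])        # one score per word
    alpha = softmax(scores)                           # influence weights
    return alpha @ U, alpha                           # (d,), (n_words,)

n, d, d_a = 6, 8, 5
U_a = rng.normal(size=(n, d))                         # inter-word hidden states
W, w, b = rng.normal(size=(d_a, d)), rng.normal(size=d_a), np.zeros(d_a)

h_a, alpha = self_attention_pool(U_a, W, w, b)
print(h_a.shape)
```

The weights alpha form a probability distribution over the words, so the pooled h_a lies in the convex hull of the hidden states.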
Step 13: use the sentence semantic representation vectors and the mutually dependent inter-word hidden-layer representation vectors to perform dynamic understanding of sentence semantics, obtaining a dynamic-understanding representation vector for each sentence.
The preferred embodiment of this step is as follows:
1) As mentioned earlier, when understanding sentence semantics humans select what to attend to according to the information they have seen and the information they want: some words are never read, while other words are read repeatedly, and each fixation covers only a very small range. Therefore, the embodiment of the present invention proposes to use a selection function to choose the single most important word at each time step, process the selected word with a GRU to obtain the corresponding hidden state, use the selection function again on this basis to choose the most important word at the next time step, process it with the GRU to obtain the next hidden state, and repeat this process until the maximum dynamic reading length is reached. The last hidden state serves as the dynamic-understanding representation vector of the sentence. Because the input to the GRU in this process is not fixed in advance but is computed from the information grasped so far, the process is called dynamic reading, and the sequence it selects is called the dynamic reading sequence:
x_t = F(U_a, h_{t-1}^a, h_b),  h_t^a = GRU(x_t, h_{t-1}^a),  v_a = h_{l_T}^a
where F denotes the selection function; h_{t-1}^a denotes the hidden state of the dynamic reading sequence for s_a at time t−1; h_b denotes the semantic representation vector of s_b, which, because the model is aimed at sentence semantic matching, is taken into account as additional supplementary information when processing s_a; l_T denotes the length of the dynamic reading sequence, which is set in advance; and v_a denotes the dynamic-understanding representation vector of s_a;
Similarly, applying the same operations to s_b yields the dynamic-understanding representation vector v_b of s_b.
As mentioned earlier, attention can help the model select the word or words most relevant to the semantic representation, which can be used to select the most important word at each moment. In the embodiment of the present invention, another attention mechanism selects the most important word x_t at time t:
α_t^d = softmax(ω_d^T tanh(W_d U_a^T + U_d h_{t-1}^a e⃗ + M_d h_b e⃗))
x_t = u_{argmax(α_t^d)}^a
where ω_d, W_d, U_d, M_d denote the weights in the attention computation, parameters learned during model training; α_t^d denotes the distribution of influence weights of each word's semantic representation on the final sentence representation; argmax(α_t^d) selects the index corresponding to the maximum influence weight; and e⃗ denotes a row vector of all ones, used to broadcast the state vectors across word positions. Since argmax is a selection operation whose gradient cannot be propagated through the softmax function, to guarantee that the entire model remains differentiable the embodiment of the present invention modifies the above formula into the following form:
x_t = U_a^T softmax(β α_t^d)
where β is an arbitrarily large positive integer: given the characteristics of the softmax function, after multiplying by an arbitrarily large positive integer the weight at the most important position approaches 1 while the other weights approach 0. In this way, the embodiment of the present invention realizes a differentiable selection of the most important word.
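One possible reading of the dynamic reading loop can be sketched as follows, with the β-sharpened softmax standing in for the non-differentiable argmax. The exact form of the score function, the value of β, and all dimensions are assumptions inferred from the description, not the patent's precise formulation.

```python
import numpy as np

rng = np.random.default_rng(3)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gru_step(x, h, p):
    z = sigmoid(x @ p["Uz"] + h @ p["Wz"])
    r = sigmoid(x @ p["Ur"] + h @ p["Wr"])
    c = np.tanh(x @ p["Uh"] + (r * h) @ p["Wh"])
    return (1.0 - z) * c + z * h

def select_word(A, h_prev, h_other, q, beta=50.0):
    """Soft argmax over words: scores depend on the word vectors, the reading
    state so far, and the other sentence's vector; a large beta pushes the
    softmax toward a one-hot choice while staying differentiable."""
    scores = np.tanh(A @ q["Wd"] + h_prev @ q["Ud"] + h_other @ q["Md"]) @ q["wd"]
    alpha = softmax(beta * scores)                    # near one-hot weights
    return alpha @ A                                  # ~ most important word

d, d_h, n, l_T = 6, 8, 5, 4
A = rng.normal(size=(n, d))                           # word vectors of s_a
h_b = rng.normal(size=d_h)                            # semantic vector of s_b
p = {k: rng.normal(scale=0.1, size=(d if k[0] == "U" else d_h, d_h))
     for k in ["Uz", "Wz", "Ur", "Wr", "Uh", "Wh"]}
q = {"Wd": rng.normal(size=(d, d)), "Ud": rng.normal(size=(d_h, d)),
     "Md": rng.normal(size=(d_h, d)), "wd": rng.normal(size=d)}

h = np.zeros(d_h)
for _ in range(l_T):                                  # dynamic reading sequence
    x_t = select_word(A, h, h_b, q)                   # F(A, h_{t-1}, h_b)
    h = gru_step(x_t, h, p)
v_a = h                                               # dynamic-understanding vector
print(v_a.shape)
```

With beta large the weighted sum is numerically close to picking a single row of A, which mirrors the patent's differentiable approximation of hard word selection.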
Step 14: integrate, respectively, the sentence semantic representation vectors and the dynamic-understanding representation vectors of the natural language sentence pair, and classify the semantic relation of the pair according to the integrated results.
The preferred embodiment of this step is as follows:
1) In the embodiment of the present invention, a heuristic method is used to integrate, respectively, the sentence semantic representation vectors and the dynamic-understanding representation vectors of the natural language sentence pair. Specifically, operations such as element-wise product, subtraction, and concatenation of the representation vectors can be chosen to integrate these characterization vectors, obtaining the sentence-semantic representation vector h and the dynamic-understanding representation vector v of the sentence pair; a multi-layer perceptron (MLP) then computes the probability of the semantic relation between the sentence pair conditioned on each kind of information. The above process is expressed as:
h = (h_a, h_b, h_b ⊙ h_a, h_b − h_a),
v = (v_a, v_b, v_b ⊙ v_a, v_b − v_a),
p_h = MLP_1(h),
p_v = MLP_1(v),
where ⊙ denotes element-wise product, − denotes subtraction, and (·,·) denotes concatenation; p_h denotes the semantic relation probability of the natural language sentence pair computed from the integrated sentence-semantic representation vector h; p_v denotes the semantic relation probability computed from the integrated dynamic-understanding representation vector v.
The MLP is a three-layer structure comprising two fully connected layers with ReLU activation and one softmax output layer. In this integration, the concatenation retains the semantic characterization information of the sentences to the greatest extent, the element-wise product captures the similarity information between the two sentences, and the subtraction captures the degree to which the semantic characterizations differ in each dimension.
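The heuristic integration (concatenation, element-wise product, difference) followed by the three-layer MLP can be sketched as follows; the class count, dimensions, and random weights are toy assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def match_features(fa, fb):
    """Concatenation keeps each sentence's semantics, the element-wise
    product captures similarity, the difference captures per-dimension
    divergence."""
    return np.concatenate([fa, fb, fb * fa, fb - fa])

def mlp3(x, W1, W2, W3):
    """Two ReLU fully connected layers plus a softmax output layer."""
    h1 = np.maximum(x @ W1, 0.0)
    h2 = np.maximum(h1 @ W2, 0.0)
    return softmax(h2 @ W3)

d, n_cls = 8, 3                                       # e.g. NLI's 3 relations
h_a, h_b = rng.normal(size=d), rng.normal(size=d)     # sentence-semantic vectors
W1 = rng.normal(size=(4 * d, 16))
W2 = rng.normal(size=(16, 16))
W3 = rng.normal(size=(16, n_cls))

h = match_features(h_a, h_b)                          # 4d feature vector
p_h = mlp3(h, W1, W2, W3)                             # relation probabilities
print(h.shape)
```

The same match_features / mlp3 pair would be applied to the dynamic-understanding vectors v_a and v_b to produce p_v.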
2) In order to understand sentence semantics more comprehensively and accurately, p_h and p_v are fused and the semantic relation between the natural language sentences is classified again: the shared weights of p_h and p_v are computed by a linear transformation, a weighted sum is taken, and the final semantic relation probability is obtained through another multi-layer perceptron MLP_2:
α_h = σ(ω_h p_h + b_h),  α_v = σ(ω_v p_v + b_v)
P(y|(s_a, s_b)) = MLP_2(α_h p_h + α_v p_v)
where ω_h and ω_v are the weight parameters in the computation of the shared weights of p_h and p_v, b_h and b_v are the corresponding bias parameters, σ denotes the sigmoid function, and α_h, α_v denote the respective weights of p_h and p_v; P(y|(s_a, s_b)) denotes the semantic relation probability distribution between the natural language sentence pair s_a and s_b.
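The fusion step, sigmoid-gated scalar weights for p_h and p_v followed by a weighted sum and a final classifier, might look roughly like this. For brevity the final MLP_2 is collapsed to a single softmax layer, which is a simplification of the description, and all weights are random toy values.

```python
import numpy as np

rng = np.random.default_rng(5)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse(p_h, p_v, w_h, w_v, b_h, b_v, W_out):
    """Scalar gates from linear transforms of each probability vector,
    then a weighted sum fed to the final classification layer."""
    a_h = sigmoid(w_h @ p_h + b_h)                    # weight of p_h
    a_v = sigmoid(w_v @ p_v + b_v)                    # weight of p_v
    return softmax((a_h * p_h + a_v * p_v) @ W_out)

n_cls = 3
p_h = softmax(rng.normal(size=n_cls))                 # from sentence semantics
p_v = softmax(rng.normal(size=n_cls))                 # from dynamic reading
w_h, w_v = rng.normal(size=n_cls), rng.normal(size=n_cls)
W_out = rng.normal(size=(n_cls, n_cls))

P = fuse(p_h, p_v, w_h, w_v, 0.0, 0.0, W_out)
print(P.shape)
```

The two gates let the model decide, per example, how much to trust the repeated-reading view versus the dynamic-reading view.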
The scheme provided by the embodiment of the present invention not only achieves a more thorough understanding of sentence semantics through the repeated reading of sentence sequences by the stacked recurrent neural network, but also, through the dynamic reading sequence structure, accurately selects and repeatedly reads the important words in a sentence, thereby achieving a more comprehensive and accurate understanding and representation of sentence semantics, efficiently modeling the semantic interaction between the two sentences, and finally judging the semantic inference relation between them accurately. At the same time it provides an accurate sentence semantic characterization method, making up for the deficiencies of existing methods in sentence semantic representation.
Through the above description of the embodiments, those skilled in the art can clearly understand that the above embodiments can be implemented by software, or by software plus a necessary general hardware platform. Based on this understanding, the technical solution of the above embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (a CD-ROM, USB flash disk, removable hard disk, etc.) and includes instructions that cause a computing device (a personal computer, server, network device, etc.) to execute the methods described in the embodiments of the present invention.
The foregoing is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any changes or substitutions that can easily be thought of by anyone skilled in the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. A natural language semantic matching method based on dynamic sequence reading, characterized by comprising:
performing semantic modeling on each word of a natural language sentence pair to obtain a word-level semantic representation vector for each natural language sentence;
from the word-level semantic representation vectors, obtaining each sentence's semantic representation vector and the mutually dependent inter-word hidden-layer representation vectors through a stacked neural network;
using the sentence semantic representation vectors and the mutually dependent inter-word hidden-layer representation vectors to perform dynamic understanding of sentence semantics, obtaining a dynamic-understanding representation vector for each sentence;
integrating, respectively, the sentence semantic representation vectors and the dynamic-understanding representation vectors of the sentence pair, and classifying the semantic relation of the pair according to the integrated results.
2. The natural language semantic matching method based on dynamic sequence reading according to claim 1, characterized in that the step of performing semantic modeling on each word of the natural language sentence pair to obtain a word-level semantic representation vector for each natural language sentence comprises:
the natural language sentence pair comprises two natural language sentences; one is denoted s_a = {a_1, ..., a_{l_a}}, a text composed of l_a words, and the other is denoted s_b = {b_1, ..., b_{l_b}}, a text composed of l_b words, where a_i and b_j respectively denote the i-th word of s_a and the j-th word of s_b;
all words in s_a and s_b constitute a dictionary V whose size is denoted l_v; each word in s_a and s_b is represented by a one-hot vector whose length is the size of V, in which only the position of the word's index in V is 1 and all other positions are 0; on this basis, the feature representation of each word, i.e. its pre-trained word semantic representation, is obtained with the pre-trained word-vector matrix E:
e_i^a = E · onehot(a_i),  e_j^b = E · onehot(b_j)
where e_i^a and e_j^b correspond to the pre-trained semantic representations of the i-th word of s_a and the j-th word of s_b;
assuming the natural language sentence pair is English text, all English letters form a character dictionary V_c of size 26; each letter of a word is represented by a one-hot vector whose length is the size of V_c, in which only the position of the letter's index in V_c is 1 and all other positions are 0; on this basis, a one-dimensional convolution is applied to the letter sequence of each word, followed by a max-pooling operation, finally yielding the character-level semantic representation of each word:
c_i^a = Maxpooling(Conv1D(E_c, a_i)),  c_j^b = Maxpooling(Conv1D(E_c, b_j))
where E_c denotes the character embedding matrix to be trained, Conv1D denotes the one-dimensional convolution operation, Maxpooling denotes the max-pooling operation, and the one-hot inputs correspond to the i_c-th letter of the i-th word of s_a and the j_c-th letter of the j-th word of s_b;
the pre-trained feature representation of each word is then concatenated with its character-level semantic representation, and a two-layer highway network integrates this information, finally yielding the semantic representation vector of each word in the natural language sentences:
a_i = Highway([e_i^a; c_i^a]),  b_j = Highway([e_j^b; c_j^b])
where Highway(·) denotes the highway network structure, and a_i, b_j denote the semantic representation vectors of the i-th word of s_a and the j-th word of s_b.
3. The natural language semantic matching method with dynamic sequence reading according to claim 2, characterized in that the step of obtaining, from the word-level semantic representation vectors and through a stacked neural network, the sentence semantic representation vector and the complementarity-aware hidden-layer representation vectors of the words comprises:
Each natural language sentence is modeled as a whole with a stacked recurrent neural network, yielding the hidden state sequence of every word in the sentence. The gated recurrent unit (GRU) serves as the basic unit: for the input x_t at time step t, the hidden state h_t of the GRU is updated as follows:

z = σ(x_t·U_z + h_{t-1}·W_z)
r = σ(x_t·U_r + h_{t-1}·W_r)
c_m = tanh(x_t·U_h + (r ⊙ h_{t-1})·W_h)
h_t = (1 - z) ⊙ c_m + z ⊙ h_{t-1}

where z, r, and c_m are the update gate, reset gate, and memory cell of the GRU, respectively; U_z and W_z are the parameter matrices of the update gate, U_r and W_r those of the reset gate, and U_h and W_h those of the memory cell; ⊙ denotes the element-wise product; x_t denotes the semantic representation vector of the t-th word in sentence s_a or s_b; and σ denotes the sigmoid activation;
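The GRU update above, written out as a single numpy step; all dimensions are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, Uz, Wz, Ur, Wr, Uh, Wh):
    """One GRU update following the parameter layout used in the text."""
    z = sigmoid(x_t @ Uz + h_prev @ Wz)          # update gate
    r = sigmoid(x_t @ Ur + h_prev @ Wr)          # reset gate
    c_m = np.tanh(x_t @ Uh + (r * h_prev) @ Wh)  # memory cell candidate
    return (1.0 - z) * c_m + z * h_prev          # interpolate old state and candidate

d_in, d_h = 4, 5                                 # illustrative input / hidden sizes
rng = np.random.default_rng(2)
P = [rng.standard_normal(s) for s in [(d_in, d_h), (d_h, d_h)] * 3]  # Uz,Wz,Ur,Wr,Uh,Wh
h1 = gru_step(rng.standard_normal(d_in), np.zeros(d_h), *P)
```

Because the new state is a convex combination of the previous state and a tanh candidate, every component stays inside (-1, 1) when the initial state is zero.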
On this basis, multiple GRU layers are stacked (stack-GRU): the input and the hidden-layer output of each GRU layer are concatenated and serve as the input of the next layer:

x_t^l = [h_t^{l-1}, x_t^{l-1}],  h_t^l = GRU_l(x_t^l)

where GRU_l denotes the l-th GRU layer, h_t^{l-1} denotes the t-th hidden state of the (l-1)-th GRU layer, x_t^{l-1} denotes the t-th input of the (l-1)-th GRU layer, and [·,·] denotes the concatenation operation;
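The stacking rule above makes each layer's input wider than the last. A sketch of the wiring, with a plain tanh cell standing in for the GRU update (a real stack-GRU would call the GRU step here; all sizes are made up):

```python
import numpy as np

def cell(inp, h_prev, W):
    """Stand-in recurrent cell; a real stack-GRU would apply the GRU update here."""
    return np.tanh(np.concatenate([inp, h_prev]) @ W)

d, L, T = 3, 3, 4                         # hidden size, layers, time steps (illustrative)
rng = np.random.default_rng(3)
# Layer l consumes [h_t^{l-1}, x_t^{l-1}], so its input width grows with depth.
in_dims = [d]
for _ in range(1, L):
    in_dims.append(d + in_dims[-1])       # next input = hidden (d) + previous input
Ws = [rng.standard_normal((in_dims[l] + d, d)) for l in range(L)]

xs = rng.standard_normal((T, d))
h = np.zeros((L, d))
outputs = []
for t in range(T):
    layer_in = xs[t]
    for l in range(L):
        h_new = cell(layer_in, h[l], Ws[l])
        layer_in = np.concatenate([h_new, layer_in])  # [h_t^l, x_t^l] feeds layer l+1
        h[l] = h_new
    outputs.append(h[-1].copy())
```

The growing input width (3, 6, 9 here) is the direct consequence of concatenating rather than replacing the lower layer's input.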
The stack-GRU reads the natural language sentence pair repeatedly, yielding the complementarity-aware hidden-layer representations of the words in each sentence, with the following formula:

m_{i'}^a = stack-GRU({a_1, ..., a_{i'}}),  m_{j'}^b = stack-GRU({b_1, ..., b_{j'}})

where m_{i'}^a and m_{j'}^b denote the sentence-level semantic representations of the i'-th word in sentence s_a and of the j'-th word in sentence s_b, respectively; {a_1, ..., a_{i'}} denotes the set of word-level semantic representations from the 1st to the i'-th word of s_a, and {b_1, ..., b_{j'}} denotes the set of word-level semantic representations from the 1st to the j'-th word of s_b;
A self-attention mechanism computes the influence weight of each word's semantic representation on the final sentence semantic representation, and these weights are used to form a weighted sum of the hidden states of all words, yielding the sentence semantic representation:

α_a = softmax(ω·tanh(W·M_a + b)),  h_a = α_a·M_a

where ω and W are the weights and b is the bias in the self-attention computation, all belonging to the parameters learned during model training; M_a denotes the matrix of hidden states of the words in s_a; α_a denotes the weight distribution obtained by applying the attention mechanism to sentence s_a; and h_a denotes the semantic representation vector of sentence s_a;
Similarly, the same operation is applied to sentence s_b to obtain its semantic representation vector h_b.
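The attentive pooling step can be sketched as: score every hidden state, normalize the scores with a softmax, and take the weighted sum. Parameter shapes are illustrative:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_pool(M, omega, W, b):
    """Score every hidden state, normalise the scores, and take the weighted sum."""
    scores = np.tanh(M @ W + b) @ omega   # one scalar score per word
    alpha = softmax(scores)               # influence weights over the words
    return alpha @ M, alpha               # sentence vector h and the weight vector

T, d = 5, 4                               # 5 words, hidden size 4 (illustrative)
rng = np.random.default_rng(4)
M = rng.standard_normal((T, d))
h_a, alpha = attentive_pool(M, rng.standard_normal(d),
                            rng.standard_normal((d, d)), np.zeros(d))
```

The softmax guarantees the weights are positive and sum to one, so h_a is always a convex combination of the word hidden states.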
4. The natural language semantic matching method with dynamic sequence reading according to claim 3, characterized in that the step of using the sentence semantic representation vectors and the complementarity-aware hidden-layer representation vectors of the words to dynamically understand the sentence semantics, obtaining the dynamic understanding representation vector of each sentence, comprises:
At each time step, a selection function selects the single most important word, and a GRU processes the selected word to obtain the corresponding hidden state; on this basis, the selection function selects the most important word of the next time step, which the GRU again processes to obtain the next hidden state. This process repeats until the preset maximum dynamic-reading length is reached, and the last hidden state is finally taken as the dynamic understanding representation vector of the sentence:

x_t = F(d_{t-1}^a, h_b),  d_t^a = GRU(x_t, d_{t-1}^a),  t = 1, ..., l_T,  v_a = d_{l_T}^a

where F denotes the selection function, d_{t-1}^a denotes the hidden state of the dynamic reading sequence of sentence s_a at time t-1, h_b denotes the semantic representation vector of sentence s_b, l_T denotes the length of the dynamic reading sequence, and v_a denotes the dynamic understanding representation vector of sentence s_a;
Similarly, the same operation is applied to sentence s_b to obtain its dynamic understanding representation vector v_b.
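The select-then-recur loop can be sketched as below. The relevance scoring and the recurrent update are deliberately simplified stand-ins (a crude bilinear score in place of the patent's selection function F, a tanh cell in place of the GRU), and all parameter names and sizes are made up:

```python
import numpy as np

def dynamic_read(M_a, h_b, W_sel, W_rec, l_T):
    """Repeatedly pick the currently most relevant word and feed it to a recurrent step.

    W_sel / W_rec are stand-in parameters; the scoring below is a simplified
    substitute for the selection function F described in the text."""
    d = M_a.shape[1]
    state = np.zeros(d)
    for _ in range(l_T):
        scores = M_a @ W_sel @ np.concatenate([state, h_b])    # relevance to state + h_b
        x_t = M_a[int(np.argmax(scores))]                      # most important word now
        state = np.tanh(np.concatenate([x_t, state]) @ W_rec)  # GRU stand-in update
    return state                                               # v_a: dynamic understanding

T, d, l_T = 6, 4, 3                        # 6 words, hidden size 4, read 3 steps (illustrative)
rng = np.random.default_rng(5)
v_a = dynamic_read(rng.standard_normal((T, d)), rng.standard_normal(d),
                   rng.standard_normal((d, 2 * d)), rng.standard_normal((2 * d, d)), l_T)
```

Note how the other sentence's vector h_b enters the scoring at every step, so which word looks "most important" changes as the reading state evolves.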
5. The natural language semantic matching method with dynamic sequence reading according to claim 4, characterized in that the most important word at time t is selected by a further attention mechanism:

γ_t = ω_d·tanh(W_d·M_a + U_d·d_{t-1}^a·1 + M_d·h_b·1),  x_t = M_a·softmax(γ_t + β·onehot(argmax(γ_t)))

where ω_d, W_d, U_d, and M_d are weights in the attention computation, belonging to the parameters learned during model training; γ_t denotes the distribution vector of the influence weights of each word's semantic representation on the final sentence semantic representation; argmax(γ_t) denotes the index corresponding to the maximum influence weight; 1 denotes a row vector of all ones; and β is an arbitrarily large positive integer.
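The role of the arbitrarily large β can be seen numerically: adding β at the argmax position before a softmax drives the weight distribution to a near-one-hot selection while keeping the operation differentiable elsewhere. The weight values here are made up:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

gamma = np.array([0.3, 1.2, 0.9, -0.5])   # influence weights over 4 words (made-up values)
beta = 50.0                                # "arbitrarily large" positive constant
boost = np.zeros_like(gamma)
boost[np.argmax(gamma)] = beta             # nonzero only at the argmax index
hard = softmax(gamma + boost)              # near-one-hot: effectively selects word 1
soft = softmax(gamma)                      # ordinary soft attention, for comparison
```

As β grows, `hard` approaches the exact one-hot vector of the argmax, which is how a soft attention layer is turned into the "select one most important word" operation.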
6. The natural language semantic matching method with dynamic sequence reading according to claim 4, characterized in that the step of integrating, respectively, the sentence semantic representation vectors and the dynamic understanding representation vectors of the natural language sentence pair, and classifying the semantic relation of the sentence pair according to the integration results, comprises:

A heuristic method is used to integrate, respectively, the sentence semantic representation vectors and the dynamic understanding representation vectors of the natural language sentence pair, yielding an integrated sentence-semantic vector and an integrated dynamic-understanding vector for the pair; a multilayer perceptron MLP then computes, conditioned on each type of information, the probability of the semantic relation between the sentence pair. This process is expressed as:
h = (h_a, h_b, h_b ⊙ h_a, h_b - h_a)
v = (v_a, v_b, v_b ⊙ v_a, v_b - v_a)
p_h = MLP_1(h)
p_v = MLP_1(v)
where ⊙ denotes the element-wise product, - denotes subtraction, and (·,·) denotes concatenation; p_h denotes the semantic-relation probability of the natural language sentence pair computed from the integrated sentence-semantic representation vector h, and p_v denotes the semantic-relation probability computed from the integrated dynamic-understanding representation vector v;
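The heuristic integration is just a fixed feature construction over the two vectors; a sketch with toy values:

```python
import numpy as np

def heuristic_match(u, v):
    """Concatenate the two vectors with their element-wise product and difference."""
    return np.concatenate([u, v, v * u, v - u])

h_a = np.array([1.0, 2.0])     # toy sentence vectors (made-up values)
h_b = np.array([0.5, -1.0])
h = heuristic_match(h_a, h_b)  # 4x the original dimension, fed to MLP_1
```

The product term highlights where the two sentences agree in sign and magnitude, while the difference term highlights where they diverge, which is why this pattern is a common matching heuristic.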
p_h and p_v are then merged and the semantic relation between the natural language sentences is classified again: a linear transformation computes the weights of p_h and p_v, a weighted sum is formed, and finally another multilayer perceptron MLP produces the final semantic-relation probability:

α_h = σ(ω_h·p_h + b_h),  α_v = σ(ω_v·p_v + b_v)
P(y|(s_a, s_b)) = MLP_2(α_h·p_h + α_v·p_v)

where ω_h and ω_v are the weighting parameters in the weight computation of p_h and p_v, b_h and b_v are the corresponding bias parameters, σ denotes the sigmoid function, α_h and α_v denote the weights of p_h and p_v, respectively, and P(y|(s_a, s_b)) denotes the probability distribution of the semantic relation between natural language sentences s_a and s_b.
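A minimal sketch of the gated fusion, with scalar weights and zero biases as stand-in parameter values; the final MLP_2 is omitted, so this shows only the weighted-sum stage:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse(p_h, p_v, w_h, w_v, b_h, b_v):
    """Weight each probability vector by a learned sigmoid gate, then sum.

    A final MLP (omitted here) would map the fused vector to class probabilities."""
    alpha_h = sigmoid(w_h * p_h + b_h)   # gate for the sentence-semantic branch
    alpha_v = sigmoid(w_v * p_v + b_v)   # gate for the dynamic-understanding branch
    return alpha_h * p_h + alpha_v * p_v

p_h = np.array([0.7, 0.2, 0.1])          # toy relation probabilities from each branch
p_v = np.array([0.6, 0.3, 0.1])
fused = fuse(p_h, p_v, 1.0, 1.0, 0.0, 0.0)
```

The sigmoid gates let the model learn, per relation class, how much to trust the static sentence representation versus the dynamic reading, instead of averaging the two branches with fixed weights.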
CN201910228242.9A 2019-03-25 2019-03-25 Natural language semantic matching method for dynamic sequence reading Active CN109948163B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910228242.9A CN109948163B (en) 2019-03-25 2019-03-25 Natural language semantic matching method for dynamic sequence reading

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910228242.9A CN109948163B (en) 2019-03-25 2019-03-25 Natural language semantic matching method for dynamic sequence reading

Publications (2)

Publication Number Publication Date
CN109948163A true CN109948163A (en) 2019-06-28
CN109948163B CN109948163B (en) 2022-10-28

Family

ID=67010836

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910228242.9A Active CN109948163B (en) 2019-03-25 2019-03-25 Natural language semantic matching method for dynamic sequence reading

Country Status (1)

Country Link
CN (1) CN109948163B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110705311A (en) * 2019-09-27 2020-01-17 安徽咪鼠科技有限公司 Semantic understanding accuracy improving method, device and system applied to intelligent voice mouse and storage medium
CN110765240A (en) * 2019-10-31 2020-02-07 中国科学技术大学 Semantic matching evaluation method for multiple related sentence pairs
CN111859909A (en) * 2020-07-10 2020-10-30 山西大学 Semantic scene consistency recognition reading robot

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018033030A1 (en) * 2016-08-19 2018-02-22 ZTE Corporation Natural language library generation method and device
CN109214006A (en) * 2018-09-18 2019-01-15 中国科学技术大学 The natural language inference method that the hierarchical semantic of image enhancement indicates
CN109344404A (en) * 2018-09-21 2019-02-15 中国科学技术大学 The dual attention natural language inference method of context aware

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018033030A1 (en) * 2016-08-19 2018-02-22 ZTE Corporation Natural language library generation method and device
CN109214006A (en) * 2018-09-18 2019-01-15 中国科学技术大学 The natural language inference method that the hierarchical semantic of image enhancement indicates
CN109344404A (en) * 2018-09-21 2019-02-15 中国科学技术大学 The dual attention natural language inference method of context aware

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XU LILI ET AL.: "Sentence-filling answer selection method for machine reading comprehension", COMPUTER ENGINEERING *
HUANG JIANGPING ET AL.: "A sentence semantic similarity model based on convolutional networks", JOURNAL OF SOUTH CHINA UNIVERSITY OF TECHNOLOGY (NATURAL SCIENCE EDITION) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110705311A (en) * 2019-09-27 2020-01-17 安徽咪鼠科技有限公司 Semantic understanding accuracy improving method, device and system applied to intelligent voice mouse and storage medium
CN110705311B (en) * 2019-09-27 2022-11-25 安徽咪鼠科技有限公司 Semantic understanding accuracy improving method, device and system applied to intelligent voice mouse and storage medium
CN110765240A (en) * 2019-10-31 2020-02-07 中国科学技术大学 Semantic matching evaluation method for multiple related sentence pairs
CN110765240B (en) * 2019-10-31 2023-06-20 中国科学技术大学 Semantic matching evaluation method for multi-phase sentence pairs
CN111859909A (en) * 2020-07-10 2020-10-30 山西大学 Semantic scene consistency recognition reading robot
CN111859909B (en) * 2020-07-10 2022-05-31 山西大学 Semantic scene consistency recognition reading robot

Also Published As

Publication number Publication date
CN109948163B (en) 2022-10-28

Similar Documents

Publication Publication Date Title
CN110188358B (en) Training method and device for natural language processing model
CN109992779B (en) Emotion analysis method, device, equipment and storage medium based on CNN
CN109657246B (en) Method for establishing extraction type machine reading understanding model based on deep learning
Gallant et al. Representing objects, relations, and sequences
CN110096711B (en) Natural language semantic matching method for sequence global attention and local dynamic attention
CN108830287A (en) The Chinese image, semantic of Inception network integration multilayer GRU based on residual error connection describes method
WO2020247616A1 (en) Linguistically rich cross-lingual text event embeddings
CN110222163A (en) A kind of intelligent answer method and system merging CNN and two-way LSTM
Li et al. A method of emotional analysis of movie based on convolution neural network and bi-directional LSTM RNN
CN109241377A (en) A kind of text document representation method and device based on the enhancing of deep learning topic information
CN112232087B (en) Specific aspect emotion analysis method of multi-granularity attention model based on Transformer
CN109214006A (en) The natural language inference method that the hierarchical semantic of image enhancement indicates
CN108763535A (en) Information acquisition method and device
CN109948163A (en) The natural language semantic matching method that sequence dynamic is read
CN112800190A (en) Intent recognition and slot value filling joint prediction method based on Bert model
CN111753088A (en) Method for processing natural language information
CN112232053A (en) Text similarity calculation system, method and storage medium based on multi-keyword pair matching
CN112000778A (en) Natural language processing method, device and system based on semantic recognition
CN112988970A (en) Text matching algorithm serving intelligent question-answering system
CN114254645A (en) Artificial intelligence auxiliary writing system
CN111914553A (en) Financial information negative subject judgment method based on machine learning
Li et al. Multimodal fusion with co-attention mechanism
Rodzin et al. Deep learning techniques for natural language processing
Tekir et al. Deep learning: Exemplar studies in natural language processing and computer vision
CN111581365B (en) Predicate extraction method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant