CN110019685A - Deep text matching method and device based on learning to rank - Google Patents

Deep text matching method and device based on learning to rank

Info

Publication number
CN110019685A
Authority
CN
China
Prior art keywords
sentence
reasoning
vector
positive
pair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910285853.7A
Other languages
Chinese (zh)
Other versions
CN110019685B (en)
Inventor
李健铨
刘小康
刘子博
晋耀红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Science and Technology (Beijing) Co., Ltd.
Original Assignee
Beijing Shenzhou Taiyue Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenzhou Taiyue Software Co Ltd
Priority to CN201910285853.7A
Publication of CN110019685A
Application granted
Publication of CN110019685B
Status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

This application provides a deep text matching method and device based on learning to rank. Specifically, first, a sentence pair composed of a hypothesis sentence and inference sentences is obtained, where the inference sentences include one positive inference sentence and multiple negative inference sentences; the hypothesis sentence is semantically related to the positive inference sentence and semantically unrelated to the negative inference sentences. Then, after corresponding processing, the sentences of the pair are turned into sentence vectors; the loss of a preset loss function is computed from the matching degree values between the sentence vectors, and the parameters of the deep matching model are adjusted according to that loss. Finally, the deep matching model obtained after parameter adjustment is used to perform text matching on input sentences. The application extends the model input from a two-sentence pair to a sentence sequence containing both positive and negative examples; because this extends the amount and variety of input data, the model fits faster and its matching precision improves.

Description

Deep text matching method and device based on learning to rank
Technical field
This application relates to the field of natural language processing, and in particular to a deep text matching method and device based on learning to rank.
Background art
Text matching is an important fundamental problem in natural language processing, and many natural language processing tasks can be abstracted as text matching tasks. For example, web search can be abstracted as a relevance-matching problem between web pages and the user's search query, automatic question answering can be abstracted as a satisfaction-matching problem between candidate answers and the question, and text deduplication can be abstracted as a similarity-matching problem between texts.
Traditional text matching methods (such as the vector space model in information retrieval) mainly address matching at the lexical level. In practice, matching algorithms based on lexical overlap have significant limitations and leave many problems unsolved, such as ambiguity and synonymy in language, compositional structure (e.g., "high-speed rail from Beijing to Shanghai" versus "high-speed rail from Shanghai to Beijing"), and asymmetric matching (e.g., in a web search task, the language on the query side and on the page side often differs greatly in form).
After the rise of deep learning, computing text matching on top of Word Embeddings (word embedding vectors) trained with neural networks attracted wide interest. Training a Word Embedding model is fairly simple, and the resulting word vectors are far more semantically computable. However, a Word Embedding trained only on unlabeled data is essentially trained on co-occurrence information, and for matching-degree computation it performs little better than topic-model techniques. Moreover, Word Embedding by itself neither represents the semantics of phrases and sentences nor solves the asymmetric matching problem.
To address these problems, supervised neural deep matching models have been proposed to improve semantic matching, such as DSSM (Deep Structured Semantic Models), CDSSM (Convolutional Latent Semantic Model), and ESIM (Enhanced Sequential Inference Model). However, existing deep matching models are mostly trained on sentence pairs. With this pairwise input, when several sentences are all similar to a training sentence, the model cannot judge which sentence is the most similar, which degrades its final matching performance.
Summary of the invention
Given the shortcomings of the existing pairwise training approach, this application provides a deep text matching method and device based on learning to rank.
According to a first aspect of the embodiments of this application, a deep text matching method based on learning to rank is provided, applied to a deep matching model. The method comprises:
obtaining a sentence pair composed of a hypothesis sentence and inference sentences, wherein the inference sentences include one positive inference sentence and multiple negative inference sentences, and the hypothesis sentence is semantically related to the positive inference sentence and semantically unrelated to the negative inference sentences;
representing each sentence of the sentence pair as word vectors, to obtain the word vector matrix of each sentence of the pair;
using the similarity matrices corresponding to the word vector matrices, generating the sentence vectors of the sentences of the pair after weighting them by their mutual similarity;
computing the loss of a preset loss function according to the matching degree between the sentence vectors;
adjusting the parameters of the deep matching model according to the loss;
using the deep matching model finally obtained by parameter adjustment to perform text matching on input sentences.
Optionally, computing the loss of the preset loss function according to the matching degree values between the sentence vectors comprises:
separately computing the matching degree values between the sentence vectors of the hypothesis sentence and of the positive inference sentence and each negative inference sentence;
computing the loss between each sentence-vector matching degree value and its standard value with a joint loss function composed of a Pointwise loss function and a Listwise loss function.
Optionally, the joint loss function loss is computed as: loss = Lp + Ll + L2Regularization, wherein:

Lp is the Pointwise loss function, Lp = max(0, m - s(rh; rp+) + s(rh; rp-)); Ll is the Listwise loss function, Ll = -log( exp(s(rh; rp+)) / Σ(i=1..n) exp(s(rh; rpi)) );

rh is the sentence vector representation of the hypothesis sentence; rp+ and rp- are the sentence vector representations of the positive and negative inference sentences, respectively; s(rh; rp+) is the cosine similarity between the sentence vectors of the hypothesis sentence and the positive inference sentence; s(rh; rp-) is the cosine similarity between the sentence vectors of the hypothesis sentence and a negative inference sentence; m is a preset threshold for separating positive from negative inference sentences; and n is the number of samples composed of the positive and negative inference sentences.
Optionally, obtaining the sentence pair composed of a hypothesis sentence and inference sentences comprises:
selecting two semantically related positive example sentences to serve as the hypothesis sentence and the positive inference sentence;
selecting multiple negative example sentences that are semantically unrelated to the positive example sentences to serve as the negative inference sentences;
composing the two positive example sentences and each negative example sentence into a sentence pair.
Optionally, representing each sentence of the sentence pair as word vectors to obtain the word vector matrix of each sentence of the pair comprises:
segmenting each sentence of the pair into words and representing the words as vectors, to obtain an initial word vector matrix;
adding part-of-speech, co-occurrence information and positional encoding vectors to the initial word vector matrix, to obtain the word vector matrix of each sentence of the pair.
Optionally, after generating the sentence vectors weighted by the mutual similarity of the sentences of the pair, the method further comprises:
normalizing the sentence vectors obtained after the hypothesis sentence is similarity-weighted against the positive inference sentence and against each negative inference sentence, to obtain the sentence vector corresponding to the hypothesis sentence.
Optionally, using the similarity matrices corresponding to the word vector matrices to generate the sentence vectors of the sentences of the pair after mutual-similarity weighting comprises:
using the similarity matrices corresponding to the word vector matrices, generating initial sentence vectors of the sentences of the pair after mutual-similarity weighting;
re-encoding each sentence vector according to the context of the sentence corresponding to each initial sentence vector, to obtain the sentence vector of each sentence of the pair.
Optionally, adjusting the parameters of the deep matching model according to the loss comprises:
adjusting the parameters of the deep matching model with minimization of the loss as the objective.
According to a second aspect of the embodiments of this application, a deep text matching device based on learning to rank is provided, applied to a deep matching model. The device comprises:
a sentence pair acquisition module, configured to obtain a sentence pair composed of a hypothesis sentence and inference sentences, wherein the inference sentences include one positive inference sentence and multiple negative inference sentences, and the hypothesis sentence is semantically related to the positive inference sentence and semantically unrelated to the negative inference sentences;
a word vector representation module, configured to represent each sentence of the sentence pair as word vectors, to obtain the word vector matrix of each sentence of the pair;
a similarity weighting module, configured to generate, using the similarity matrices corresponding to the word vector matrices, the sentence vectors of the sentences of the pair after mutual-similarity weighting;
a loss computation module, configured to compute the loss of a preset loss function according to the matching degree between the sentence vectors;
a model parameter adjustment module, configured to adjust the parameters of the deep matching model according to the loss;
a text matching module, configured to perform text matching on input sentences using the deep matching model finally obtained by parameter adjustment.
Optionally, the loss computation module comprises:
a similarity computation unit, configured to separately compute the matching degree values between the sentence vectors of the hypothesis sentence and of the positive inference sentence and each negative inference sentence;
a loss computation unit, configured to compute, with a joint loss function composed of a Pointwise loss function and a Listwise loss function, the loss between each sentence-vector matching degree value and its standard value.
As can be seen from the above technical solutions, in the deep text matching method and device based on learning to rank provided by this embodiment, when training the deep matching model, the model is adjusted so that the input sentence pair contains not only the hypothesis sentence and the positive inference sentence but also multiple negative inference sentences that are semantically unrelated to both. In this way, the input is extended from a two-sentence pair to a sentence sequence containing both positive and negative examples. Because the amount and variety of the model input is extended, the model fits faster and its generalization ability is enhanced. In addition, this embodiment adjusts the parameters of the deep matching model with a loss function that incorporates the idea of ranking, so that the pair to which the final deep matching model assigns the highest matching probability is the hypothesis sentence with the positive inference sentence, which makes the text matching accuracy of the adjusted model higher.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present invention.
Brief description of the drawings
The accompanying drawings herein are incorporated into and form part of this specification; they show embodiments consistent with the present invention and, together with the specification, serve to explain its principles.
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is apparent that persons of ordinary skill in the art can obtain other drawings from these drawings without any creative effort.
Fig. 1 is a basic flow diagram of a deep text matching method based on learning to rank provided by an embodiment of this application;
Fig. 2 is a basic structural diagram of a deep matching model provided by an embodiment of this application;
Fig. 3a is a diagram of element-wise addition of the added information vectors and the word vectors, provided by an embodiment of this application;
Fig. 3b is a diagram of concatenating the added information vectors to the word vectors, provided by an embodiment of this application;
Fig. 4 is a diagram of the difference between shared and unshared weights when extracting features with a bidirectional LSTM, provided by an embodiment of this application;
Fig. 5 is a diagram of feature selection with a convolutional neural network, provided by an embodiment of this application;
Fig. 6 is a diagram of the different output options when a bidirectional LSTM extracts features, provided by an embodiment of this application;
Fig. 7 is a basic structural diagram of a deep text matching device based on learning to rank provided by an embodiment of this application.
Detailed description of the embodiments
Exemplary embodiments are described in detail here, with examples illustrated in the accompanying drawings. Where the following description involves the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention; rather, they are merely examples of devices and methods, consistent with some aspects of the invention, that are detailed in the appended claims.
Since existing deep matching models are mostly trained on sentence pairs and suffer from inaccurate matching similarity, this embodiment provides a deep text matching method based on learning to rank that is applied to a deep matching model; the method is applicable to various deep matching models.

Fig. 1 is a basic flow diagram of a deep text matching method based on learning to rank provided by an embodiment of this application. As shown in Fig. 1, the method specifically comprises the following steps:
S110: obtain a sentence pair composed of a hypothesis sentence and inference sentences, wherein the inference sentences include one positive inference sentence and multiple negative inference sentences, and the hypothesis sentence is semantically related to the positive inference sentence and semantically unrelated to the negative inference sentences.
Fig. 2 is a basic structural diagram of a deep matching model provided by an embodiment of this application. As shown in Fig. 2, the deep matching model mainly consists of an input layer, a representation layer, an interaction layer, a feature selection layer, an encoding layer, a matching layer and an output layer. It should be noted that the method provided by this embodiment is not limited to a deep matching model of this structure; other structures are possible, generally still taking the input layer, representation layer, interaction layer, matching layer and output layer as the basic structure.
During model training, the sentence pair that existing approaches input usually contains only two sentences, denoted sentence A and sentence B, and this suffers from low matching accuracy. Therefore, besides inputting sentence A and sentence B, this embodiment also inputs several sentences semantically unrelated to them. Sentence A and sentence B serve as the positive examples, i.e., the hypothesis sentence and the positive inference sentence, while the several semantically unrelated sentences serve as the negative examples, i.e., the negative inference sentences. The number of negative examples is unrestricted in this embodiment, and negative examples can be samples drawn at random from other matching sentence pairs.
For example, an input sentence sample is as follows:
Hypothesis sentence: it is sunny today;
Positive inference sentence: the weather is fine today;
Negative inference sentence 1: it is raining heavily today;
Negative inference sentence 2: ...
Further, since the deep matching model encodes each sentence individually, this embodiment increases the amount of input data by inputting sentences A and B twice with their roles reversed, as follows:

First, two semantically related positive example sentences, e.g., sentence A and sentence B, are selected to serve as the hypothesis sentence and the positive inference sentence. Then, multiple negative example sentences semantically unrelated to the positive example sentences, e.g., sentence C, sentence D, ..., are selected to serve as the negative inference sentences. Finally, each of the two positive example sentences is in turn taken as the hypothesis sentence, with the other positive example sentence as the positive inference sentence, and composed with the negative example sentences into a sentence pair. The input sentence pairs thus include <sentence A, sentence B, sentence C, sentence D, ...> and <sentence B, sentence A, sentence C, sentence D, ...>. Each sentence of each pair is then segmented into words.
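To make the construction concrete, here is a minimal Python sketch of this sampling scheme (the helper name and the random draw of negatives from other pairs are illustrative assumptions, not part of the patent):

```python
import random

def build_training_samples(pos_pair, corpus_sentences, num_negatives=4):
    """Build the two extended samples <A, B, C, D, ...> and <B, A, C, D, ...>.

    pos_pair: (sentence_a, sentence_b), a semantically related positive pair.
    corpus_sentences: pool of sentences from other pairs, used as negatives.
    """
    sent_a, sent_b = pos_pair
    # Negative inference sentences: randomly sampled, assumed unrelated to the pair.
    negatives = random.sample(
        [s for s in corpus_sentences if s not in pos_pair], num_negatives
    )
    # Swap the roles of A and B to double the amount of input data.
    return [
        {"hypothesis": sent_a, "inferences": [sent_b] + negatives},
        {"hypothesis": sent_b, "inferences": [sent_a] + negatives},
    ]

samples = build_training_samples(
    ("It is sunny today", "The weather is fine today"),
    ["It is raining heavily today", "The train to Shanghai is delayed",
     "He bought three apples", "Stock prices fell sharply", "She plays the piano"],
)
```

Each returned sample ranks the positive inference sentence against the shared negatives for one hypothesis.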
S120: represent each sentence of the sentence pair as word vectors, to obtain the word vector matrix of each sentence of the pair.
The input data is segmented into words, and the words are then represented with a trained Word Embedding model, such as word2vec or GloVe.
To increase the amount of input information, this embodiment also adds some information vectors on top of the word vectors, namely part-of-speech, co-occurrence information and positional encoding vectors. The representation of each kind of vector is as follows:
Part-of-speech vector: each part of speech is represented by a random vector of fixed length.
Co-occurrence information vector: co-occurrence information refers to words that occur in both the hypothesis and the inference sentence, such as the word "today" in the hypothesis and positive inference sentences above. In this embodiment, co-occurrence information takes one of three values: 0 represents a <PAD> position added to pad the sentence, i.e., a position with no value of its own, so that the null values padded in by the deep matching model can be accommodated; 1 represents that the word occurs in both sentences; 2 represents that the word does not occur in both the hypothesis and the inference sentence. This embodiment sets the co-occurrence information vector to a vector of length one.
Positional encoding vector: the positional encoding can be computed by formula, or represented by a learnable, randomly initialized vector.
Specifically, the positional encoding vector computed by formula can use the following formulas:

PE(pos, 2i) = sin(pos / C^(2i/d1))   Formula (1)
PE(pos, 2i+1) = cos(pos / C^(2i/d1))   Formula (2)

In formulas (1) and (2), pos denotes the position of the word in the input sentence, d1 denotes the dimension of the word vector, C is a periodic coefficient, PE(pos, 2i) denotes the 2i-th dimension of the positional encoding of the word at position pos, and PE(pos, 2i+1) denotes the (2i+1)-th dimension of the positional encoding of the word at position pos.
Alternatively, when the positional encoding vector is represented by a learnable, randomly initialized vector, a randomly initialized vector can be fed into the model; the model then learns by itself to adjust this vector to a more reasonable value, and the adjusted vector is used as the positional encoding vector.
After the part-of-speech, co-occurrence information and positional encoding vectors are obtained, they can be added to the word vectors. In this embodiment, the vectors obtained by Word Embedding are called the initial word vectors, and the vectors obtained after adding the above vectors are the word vectors. As for the manner of addition, the above vectors can be added element-wise to the initial word vectors; Fig. 3a shows this element-wise addition of the added information vectors and the word vectors. Alternatively, they can be concatenated to the initial word vectors to form one longer vector; Fig. 3b shows this concatenation of the added information vectors to the word vectors.
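The following numpy sketch illustrates formulas (1) and (2) and both combination schemes of Fig. 3a and Fig. 3b (the dimensions, the value of C, and the random feature vectors are illustrative assumptions):

```python
import numpy as np

def sinusoidal_position_encoding(seq_len, dim, C=10000.0):
    # Formulas (1) and (2): sin on even dimensions, cos on odd dimensions.
    pos = np.arange(seq_len)[:, None]
    i = np.arange(dim // 2)[None, :]
    angle = pos / np.power(C, 2 * i / dim)
    pe = np.zeros((seq_len, dim))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

seq_len, dim = 8, 16
word_vecs = np.random.randn(seq_len, dim)          # initial word vector matrix
pos_tag_vecs = np.random.randn(seq_len, dim)       # one fixed random vector per POS tag
cooc_vecs = np.random.randint(0, 3, (seq_len, 1))  # 0 = <PAD>, 1 = co-occurs, 2 = does not
pe = sinusoidal_position_encoding(seq_len, dim)

# Fig. 3a style: element-wise addition (the added parts must share the word vector dimension,
# so the length-one co-occurrence vector is left out here).
added = word_vecs + pos_tag_vecs + pe

# Fig. 3b style: concatenation into one longer vector per token.
concatenated = np.concatenate([word_vecs, pos_tag_vecs, cooc_vecs, pe], axis=-1)
```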
S130: using the similarity matrices corresponding to the word vector matrices, generate the sentence vectors of the sentences of the pair after weighting them by their mutual similarity.
In the interaction layer of the model in Fig. 2, an attention mechanism is used to obtain the similarity matrix of each sentence pair; this embodiment obtains the matrix by matrix multiplication of the word vector representation matrices of the two sentences. The representations of the hypothesis H and the inference P of the sentence pair are then regenerated according to this similarity matrix. This step can be understood as re-encoding each sentence under its current context after the word vector representation, to obtain a new sentence representation.
See the following formulas (3) and (4):

h~i = Σ(j=1..len(P)) [exp(eij) / Σ(k=1..len(P)) exp(eik)] pj   Formula (3)
p~j = Σ(i=1..len(H)) [exp(eij) / Σ(k=1..len(H)) exp(ekj)] hi   Formula (4)

In formulas (3) and (4), len(H) and len(P) denote the lengths of the two sentences, h~ and p~ are the weighted sentence representations, h and p are the original sentence representations, and e is the weight, obtained from the corresponding entry of the similarity matrix.
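A few lines of numpy suffice to sketch the soft alignment of formulas (3) and (4), with the similarity matrix e taken as the product of the two word vector matrices as described above (the shapes are illustrative):

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

len_h, len_p, dim = 5, 7, 16
h_bar = np.random.randn(len_h, dim)   # encoded hypothesis sentence
p_bar = np.random.randn(len_p, dim)   # encoded inference sentence

e = h_bar @ p_bar.T                   # similarity matrix, shape (len_h, len_p)

# Formula (3): each hypothesis word as a similarity-weighted sum of inference words.
h_tilde = softmax(e, axis=1) @ p_bar
# Formula (4): each inference word as a similarity-weighted sum of hypothesis words.
p_tilde = softmax(e, axis=0).T @ h_bar
```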
It should be noted that a variety of sentence-interaction attention mechanisms can be used in this embodiment. This example uses a bidirectional LSTM (Long Short-Term Memory) structure, whose representation formulas are as follows:

yt = g(V At + V' A't)   Formula (5)
At = f(U xt + W At-1)   Formula (6)
A't = f(U' xt + W' A't-1)   Formula (7)

In formulas (5) to (7), V, V', U, U', W and W' are weight matrices, f and g are activation functions, x is the input, A is the hidden state parameter, y is the output, and t is the time step.
Using the above bidirectional LSTM structure, first, the two sentences of each sentence pair are word-aligned to obtain the similarity matrix between the two sentences; then, local inference over the two sentences is performed with the similarity matrix obtained above, and the two sentences of the pair are combined to generate, from each other, the sentences after mutual-similarity weighting. In addition, if syntactic parsing of the sentences is available, a tree LSTM can also be used here in place of the bidirectional LSTM.
S140: compute the loss of a preset loss function according to the matching degree values between the sentence vectors.
In the matching layer and output layer of the model in Fig. 2, the matching degree value between the sentence vector of the hypothesis sentence H and the sentence vector of the inference sentence P is computed separately for each sentence pair obtained above, yielding N output values, Score1, Score2, ..., ScoreN in Fig. 2, where N is the number of all inference sentences, positive and negative examples included. The loss function can then be computed from the ranking of the N output values, the model parameters adjusted, and training continued. To reduce computation, one may attend only to whether the highest matching degree belongs to the hypothesis sentence with the positive inference sentence.
To better evaluate the above matching degree values, this embodiment merges the Pointwise and Listwise ideas. Specifically, a joint loss function composed of a Pointwise loss function and a Listwise loss function is used to compute the gap between each sentence-vector matching degree value and its standard value, and the parameters of the deep matching model are adjusted according to that gap. The Pointwise loss function is computed as follows:

Lp = max(0, m - s(rh; rp+) + s(rh; rp-))   Formula (8)

In formula (8), s(rh; rp+) is the cosine similarity between the sentence vectors of the hypothesis sentence and the positive inference sentence, s(rh; rp-) is the cosine similarity between the sentence vectors of the hypothesis sentence and a negative inference sentence, and m is a preset threshold for separating positive from negative inference sentences.
From the above formula, the Pointwise loss is large when the matching degree between the hypothesis sentence and the positive inference sentence is low, and also large when the matching degree between the hypothesis sentence and a negative inference sentence is high. Used alone, the Pointwise loss function therefore ranks reasonably well, but the similarity values it produces are not accurate enough. For this reason, this embodiment also incorporates the Listwise loss function, computed as follows:

Ll = -log( exp(s(rh; rp+)) / Σ(i=1..n) exp(s(rh; rpi)) )   Formula (9)

where n is the number of samples composed of the positive and negative inference sentences. To prevent overfitting of the model, this embodiment adds L2 regularization (L2Regularization) to the loss function, and the final joint loss function loss is as follows:

loss = Lp + Ll + L2Regularization   Formula (10)
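The joint loss of formula (10) can be sketched in PyTorch as follows; the Listwise term uses the softmax form of formula (9) as reconstructed above, and the margin m, the hardest-negative reduction of the Pointwise term, and the L2 weight are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def joint_loss(hyp_vec, pos_vec, neg_vecs, model_params, m=0.5, l2_weight=1e-4):
    """hyp_vec, pos_vec: (dim,); neg_vecs: (n_neg, dim); model_params: iterable of tensors."""
    s_pos = F.cosine_similarity(hyp_vec, pos_vec, dim=0)
    s_negs = F.cosine_similarity(hyp_vec.unsqueeze(0), neg_vecs, dim=1)

    # Formula (8): hinge-style pointwise loss, here taken against the hardest negative.
    l_p = torch.clamp(m - s_pos + s_negs.max(), min=0.0)

    # Formula (9) (assumed softmax form): listwise loss over all n scores,
    # pushing the positive inference sentence to rank first.
    scores = torch.cat([s_pos.unsqueeze(0), s_negs])
    l_l = -F.log_softmax(scores, dim=0)[0]

    # L2 regularization over the model parameters.
    l2 = l2_weight * sum((p ** 2).sum() for p in model_params)
    return l_p + l_l + l2

# Toy usage with random vectors and a dummy parameter tensor:
params = [torch.randn(4, 4, requires_grad=True)]
loss = joint_loss(torch.randn(8), torch.randn(8), torch.randn(3, 8), params)
```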
S150: adjust the parameters of the deep matching model according to the loss.

Specifically, during training the model can be trained continually with minimization of the above loss as the objective, to obtain the final deep matching model.
S160: use the deep matching model finally obtained by parameter adjustment to perform text matching on input sentences.

For example, the deep matching model obtained through continual parameter adjustment can be used by feeding the sentence pairs of a test set into the model for text matching and computing its matching accuracy.
With the deep text matching method based on learning to rank provided by this embodiment, when training the deep matching model, the model is adjusted so that the input sentence pair contains not only the hypothesis sentence and the positive inference sentence but also multiple negative inference sentences semantically unrelated to both. In this way, the input is extended from a two-sentence pair to a sentence sequence containing both positive and negative examples; extending the amount and variety of model input makes the model fit faster and helps enhance its generalization ability. In addition, this embodiment incorporates the idea of ranking into the model: when the loss function is used to adjust the parameters of the deep matching model, the objective is that the pair to which the model assigns the highest matching probability is the hypothesis sentence with the positive inference sentence, which makes the text matching accuracy of the adjusted model higher. Finally, this embodiment also integrates an attention mechanism to generate the mutual-similarity-weighted sentence vectors of the two sentences of each pair; since the words of the two sentences of a pair are thereby associated, the performance of the model is further improved.
As shown in Fig. 2, besides the data processing of the input layer, interaction layer, matching layer and output layer, the deep matching model provided by this embodiment is also provided with a representation layer, a feature selection layer and an encoding layer. Correspondingly, besides the above steps, the deep text matching method further includes the following steps:
First, after each sentence of the sentence pair is represented as word vectors in step S120, there is a further feature extraction step: in the representation layer, each word vector is re-encoded according to its context in the sentence, yielding a new word vector representation of the sentences of the pair.
Specifically, this step can be carried out with various feature extraction structures, such as convolutional neural networks (CNN), recurrent neural networks (RNN), or attention mechanisms. This embodiment again uses a bidirectional LSTM structure. Fig. 4 shows the difference between shared and unshared weights when features are extracted with a bidirectional LSTM; as shown in Fig. 4, the hypothesis and inference sentences may or may not share weights during feature extraction, and in a concrete implementation the choice can be made according to the required training speed and the amount of training data.
Further, since each sentence of the pair is represented as word vectors separately in step S120, the hypothesis sentence obtains N word vector representations, one for each of the N inference sentences. For the convenience of subsequent operations, this embodiment normalizes the N word vector representations of the hypothesis sentence in the feature selection layer.
As in Fig. 2, the model uses the most basic averaging:

a = (1/N) Σ(i=1..N) ai   Formula (11)

In formula (11), N is the number of word vector representations of the hypothesis sentence, and ai is the i-th word vector representation of the hypothesis sentence output by the representation layer.
Of course, in a concrete implementation, besides the above manner, the model can also take a weighted sum with learnable weights; alternatively, feature extraction can be performed with a convolutional neural network, a recurrent neural network, and so on. Fig. 5 shows feature selection with a convolutional neural network provided by an embodiment of this application: as shown in Fig. 5, the word vectors are spliced laterally, convolved with a convolutional neural network, and then pooled to produce the output.
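A brief PyTorch sketch of the two selection schemes, plain averaging per formula (11) and a weighted sum with learnable weights (the softmax normalization of the learned weights is an illustrative assumption):

```python
import torch

n, seq_len, dim = 4, 8, 16
hyp_reprs = torch.randn(n, seq_len, dim)  # N encodings of the same hypothesis sentence

# Formula (11): plain averaging over the N copies.
mean_repr = hyp_reprs.mean(dim=0)

# Alternative: weighted sum with learnable weights, normalized by softmax.
w = torch.nn.Parameter(torch.zeros(n))
weighted_repr = (torch.softmax(w, dim=0)[:, None, None] * hyp_reprs).sum(dim=0)
```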
Further, after the sentence vectors of the sentences of the pair are generated by mutual-similarity weighting with the similarity matrices corresponding to the word vector matrices, the following processing is also performed: each word vector is re-encoded according to its context in the sentence, yielding a new word vector representation of the sentences of the pair.
Specifically, this embodiment again uses a bidirectional LSTM structure for feature extraction and encoding. Fig. 6 shows the different output options when the bidirectional LSTM provided by an embodiment of this application extracts features: as shown in Fig. 6, this embodiment can use the hidden state output by the LSTM structure as the new word vector representation, or take the outputs of the bidirectional LSTM at each time step, compute the element-wise maximum and mean, and concatenate them into the new representation.
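The output options of Fig. 6 can be sketched as follows with a toy bidirectional LSTM (the sizes are illustrative):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=16, hidden_size=32, bidirectional=True, batch_first=True)
x = torch.randn(2, 8, 16)                  # (batch, seq_len, dim)
outputs, (h_n, _) = lstm(x)                # outputs: (2, 8, 64)

# Option 1: use the final hidden states of both directions as the sentence vector.
sent_vec_hidden = torch.cat([h_n[0], h_n[1]], dim=-1)      # (2, 64)

# Option 2: element-wise max and mean over all time steps, concatenated.
sent_vec_pooled = torch.cat(
    [outputs.max(dim=1).values, outputs.mean(dim=1)], dim=-1
)                                                          # (2, 128)
```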
A deep matching model trained with the above method reaches 94% accuracy on a certain existing financial-corpus test set, while on the same training and test sets a conventional model reaches only 88% accuracy. The method provided by this embodiment thus makes a series of improvements to the model training process, and experiments show that the model trained by this method outperforms conventional methods.
Based on the above method, this embodiment further provides a deep text matching device based on learning to rank. Fig. 7 is a basic structural diagram of a deep text matching device based on learning to rank provided by an embodiment of this application. As shown in Fig. 7, the device comprises:
a sentence pair acquisition module 710, configured to obtain a sentence pair composed of a hypothesis sentence and inference sentences, wherein the inference sentences include one positive inference sentence and multiple negative inference sentences, and the hypothesis sentence is semantically related to the positive inference sentence and semantically unrelated to the negative inference sentences;
a word vector representation module 720, configured to represent each sentence of the sentence pair as word vectors, to obtain the word vector matrix of each sentence of the pair;
a similarity weighting module 730, configured to generate, using the similarity matrices corresponding to the word vector matrices, the sentence vectors of the sentences of the pair after mutual-similarity weighting;
a loss computation module 740, configured to compute the loss of a preset loss function according to the matching degree between the sentence vectors;
a model parameter adjustment module 750, configured to adjust the parameters of the deep matching model according to the loss;
a text matching module 760, configured to perform text matching on input sentences using the deep matching model finally obtained by parameter adjustment.
Further, the loss computation module 740 comprises:
a similarity computation unit, configured to separately compute the matching degree values between the sentence vectors of the hypothesis sentence and of the positive inference sentence and each negative inference sentence;
a loss computation unit, configured to compute, with a joint loss function composed of a Pointwise loss function and a Listwise loss function, the loss between each sentence-vector matching degree value and its standard value.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the device or system embodiments are substantially similar to the method embodiments, their description is relatively brief, and the relevant parts refer to the description of the method embodiments. The device and system embodiments described above are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment, which persons of ordinary skill in the art can understand and implement without creative effort.
The above are only specific implementations of the present invention. It should be noted that persons skilled in the art can make various improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A deep text matching method based on learning to rank, applied to a deep matching model, characterized in that the method comprises:
obtaining a sentence pair composed of a hypothesis sentence and inference sentences, wherein the inference sentences include one positive inference sentence and multiple negative inference sentences, and the hypothesis sentence is semantically related to the positive inference sentence and semantically unrelated to the negative inference sentences;
representing each sentence of the sentence pair as word vectors, to obtain the word vector matrix of each sentence of the pair;
using the similarity matrices corresponding to the word vector matrices, generating the sentence vectors of the sentences of the pair after mutual-similarity weighting;
computing the loss of a preset loss function according to the matching degree values between the sentence vectors;
adjusting the parameters of the deep matching model according to the loss;
using the deep matching model finally obtained by parameter adjustment to perform text matching on input sentences.
2. The method according to claim 1, characterized in that computing the loss of the preset loss function according to the matching degree values between the sentence vectors comprises:
separately computing the matching degree values between the sentence vectors of the hypothesis sentence and of the positive inference sentence and each negative inference sentence;
computing the loss between each sentence-vector matching degree value and its standard value with a joint loss function composed of a Pointwise loss function and a Listwise loss function.
3. The method according to claim 2, characterized in that the joint loss function loss is computed as: loss = Lp + Ll + L2Regularization, wherein:
Lp is the Pointwise loss function, Lp = max(0, m - s(rh; rp+) + s(rh; rp-)); Ll is the Listwise loss function, Ll = -log( exp(s(rh; rp+)) / Σ(i=1..n) exp(s(rh; rpi)) ); rh is the sentence vector representation of the hypothesis sentence; rp+ and rp- are the sentence vector representations of the positive and negative inference sentences, respectively; s(rh; rp+) is the cosine similarity between the sentence vectors of the hypothesis sentence and the positive inference sentence; s(rh; rp-) is the cosine similarity between the sentence vectors of the hypothesis sentence and a negative inference sentence; m is a preset threshold for separating positive from negative inference sentences; and n is the number of samples composed of the positive and negative inference sentences.
4. The method according to claim 1, characterized in that obtaining the sentence pair composed of a hypothesis sentence and inference sentences comprises:
selecting two semantically related positive example sentences to serve as the hypothesis sentence and the positive inference sentence;
selecting multiple negative example sentences semantically unrelated to the positive example sentences to serve as the negative inference sentences;
composing the two positive example sentences and each negative example sentence into a sentence pair.
5. The method according to claim 1, characterized in that representing each sentence of the sentence pair as word vectors to obtain the word vector matrix of each sentence of the pair comprises:
segmenting each sentence of the pair into words and representing the words as vectors, to obtain an initial word vector matrix;
adding part-of-speech, co-occurrence information and positional encoding vectors to the initial word vector matrix, to obtain the word vector matrix of each sentence of the pair.
6. The method according to claim 1, characterized in that after the sentence vectors of the sentences of the pair are generated by mutual-similarity weighting, the method further comprises:
normalizing the sentence vectors obtained after the hypothesis sentence is similarity-weighted against the positive inference sentence and against each negative inference sentence, to obtain the sentence vector corresponding to the hypothesis sentence.
7. The method according to claim 1, characterized in that using the similarity matrices corresponding to the word vector matrices to generate the sentence vectors of the sentences of the pair after mutual-similarity weighting comprises:
using the similarity matrices corresponding to the word vector matrices, generating initial sentence vectors of the sentences of the pair after mutual-similarity weighting;
re-encoding each sentence vector according to the context of the sentence corresponding to each initial sentence vector, to obtain the sentence vector of each sentence of the pair.
8. The method according to claim 1, characterized in that adjusting the parameters of the deep matching model according to the loss comprises:
adjusting the parameters of the deep matching model with minimization of the loss as the objective.
9. A deep text matching device based on learning to rank, applied to a deep matching model, characterized in that the device comprises:
a sentence pair acquisition module, configured to obtain a sentence pair composed of a hypothesis sentence and inference sentences, wherein the inference sentences include one positive inference sentence and multiple negative inference sentences, and the hypothesis sentence is semantically related to the positive inference sentence and semantically unrelated to the negative inference sentences;
a word vector representation module, configured to represent each sentence of the sentence pair as word vectors, to obtain the word vector matrix of each sentence of the pair;
a similarity weighting module, configured to generate, using the similarity matrices corresponding to the word vector matrices, the sentence vectors of the sentences of the pair after mutual-similarity weighting;
a loss computation module, configured to compute the loss of a preset loss function according to the matching degree between the sentence vectors;
a model parameter adjustment module, configured to adjust the parameters of the deep matching model according to the loss;
a text matching module, configured to perform text matching on input sentences using the deep matching model finally obtained by parameter adjustment.
10. The device according to claim 9, characterized in that the loss computation module comprises:
a similarity computation unit, configured to separately compute the matching degree values between the sentence vectors of the hypothesis sentence and of the positive inference sentence and each negative inference sentence;
a loss computation unit, configured to compute, with a joint loss function composed of a Pointwise loss function and a Listwise loss function, the loss between each sentence-vector matching degree value and its standard value.
CN201910285853.7A 2019-04-10 2019-04-10 Deep text matching method and device based on learning to rank Active CN110019685B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910285853.7A CN110019685B (en) 2019-04-10 2019-04-10 Deep text matching method and device based on learning to rank

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910285853.7A CN110019685B (en) 2019-04-10 2019-04-10 Deep text matching method and device based on learning to rank

Publications (2)

Publication Number Publication Date
CN110019685A true CN110019685A (en) 2019-07-16
CN110019685B CN110019685B (en) 2021-08-20

Family

ID=67190939

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910285853.7A Active CN110019685B (en) 2019-04-10 2019-04-10 Deep text matching method and device based on learning to rank

Country Status (1)

Country Link
CN (1) CN110019685B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509463A (en) * 2017-02-28 2018-09-07 华为技术有限公司 Question answering method and device
US20180349477A1 (en) * 2017-06-06 2018-12-06 Facebook, Inc. Tensor-Based Deep Relevance Model for Search on Online Social Networks
CN109101537A (en) * 2018-06-27 2018-12-28 北京慧闻科技发展有限公司 Multi-turn dialogue data classification method and device based on deep learning, and electronic device
CN109145292A (en) * 2018-07-26 2019-01-04 黑龙江工程学院 Paraphrase text deep matching model construction method and paraphrase text deep matching method
CN109086423A (en) * 2018-08-08 2018-12-25 北京神州泰岳软件股份有限公司 Text matching method and device
CN109344404A (en) * 2018-09-21 2019-02-15 中国科学技术大学 Context-aware dual-attention natural language inference method
CN109471945A (en) * 2018-11-12 2019-03-15 中山大学 Medical text classification method, device and storage medium based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
周伟杰: "Research on Semantic Relevance Computation for the Question-Answering Domain", China Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110457444A (en) * 2019-08-14 2019-11-15 山东浪潮人工智能研究院有限公司 Synonymous sentence conversion method based on deep text matching
CN110705283A (en) * 2019-09-06 2020-01-17 上海交通大学 Deep learning method and system based on matching of text laws and regulations and judicial interpretations
CN110795934A (en) * 2019-10-31 2020-02-14 北京金山数字娱乐科技有限公司 Sentence analysis model training method and device and sentence analysis method and device
CN110795934B (en) * 2019-10-31 2023-09-19 北京金山数字娱乐科技有限公司 Sentence analysis model training method and device and sentence analysis method and device
CN111027320A (en) * 2019-11-15 2020-04-17 北京三快在线科技有限公司 Text similarity calculation method and device, electronic equipment and readable storage medium
CN110969006B (en) * 2019-12-02 2023-03-21 支付宝(杭州)信息技术有限公司 Training method and system for a text ranking model
CN110969006A (en) * 2019-12-02 2020-04-07 支付宝(杭州)信息技术有限公司 Training method and system for a text ranking model
CN111368903A (en) * 2020-02-28 2020-07-03 深圳前海微众银行股份有限公司 Model performance optimization method, device, equipment and storage medium
CN111368903B (en) * 2020-02-28 2021-08-27 深圳前海微众银行股份有限公司 Model performance optimization method, device, equipment and storage medium
CN111898362A (en) * 2020-05-15 2020-11-06 联想(北京)有限公司 Data processing method and device
CN112560427A (en) * 2020-12-16 2021-03-26 平安银行股份有限公司 Question expansion method and device, electronic device and medium
CN112560427B (en) * 2020-12-16 2023-09-22 平安银行股份有限公司 Question expansion method and device, electronic device and medium
CN113361259A (en) * 2021-06-04 2021-09-07 浙江工业大学 Service flow extraction method
CN113935329A (en) * 2021-10-13 2022-01-14 昆明理工大学 Asymmetric text matching method based on adaptive feature recognition and denoising
CN114065729A (en) * 2021-11-16 2022-02-18 神思电子技术股份有限公司 Text sorting method based on deep text matching model

Also Published As

Publication number Publication date
CN110019685B (en) 2021-08-20

Similar Documents

Publication Publication Date Title
CN110019685A (en) Depth text matching technique and device based on sequence study
CN109992648A (en) The word-based depth text matching technique and device for migrating study
Tan et al. Lstm-based deep learning models for non-factoid answer selection
KR102194837B1 (en) Method and apparatus for answering knowledge-based question
CN109992788A (en) Depth text matching technique and device based on unregistered word processing
CN108681574B (en) Text abstract-based non-fact question-answer selection method and system
CN103218436B (en) A kind of Similar Problems search method and device that merges class of subscriber label
CN104598611B (en) The method and system being ranked up to search entry
CN109241294A (en) A kind of entity link method and device
CN108509411A (en) Semantic analysis and device
CN106844658A (en) A kind of Chinese text knowledge mapping method for auto constructing and system
CN109062902B (en) Text semantic expression method and device
Mazumder et al. Towards a continuous knowledge learning engine for chatbots
US20220318317A1 (en) Method for disambiguating between authors with same name on basis of network representation and semantic representation
CN113282711B (en) Internet of vehicles text matching method and device, electronic equipment and storage medium
CN109829045A (en) A kind of answering method and device
KR20210070904A (en) Method and apparatus for multi-document question answering
CN110083676B (en) Short text-based field dynamic tracking method
CN113934835A (en) Retrieval type reply dialogue method and system combining keywords and semantic understanding representation
CN107122378B (en) Object processing method and device and mobile terminal
CN114372454A (en) Text information extraction method, model training method, device and storage medium
CN111325015B (en) Document duplicate checking method and system based on semantic analysis
Ye et al. A sentiment based non-factoid question-answering framework
CN108256030A (en) A kind of degree adaptive Concept Semantic Similarity computational methods based on ontology
CN112632250A (en) Question and answer method and system under multi-document scene

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20190904

Address after: Room 630, 6th floor, Block A, Wanliu Xingui Building, 28 Wanquanzhuang Road, Haidian District, Beijing

Applicant after: China Science and Technology (Beijing) Co., Ltd.

Address before: Room 601, Block A, Wanliu Xingui Building, 28 Wanquanzhuang Road, Haidian District, Beijing

Applicant before: Beijing Shenzhou Taiyue Software Co., Ltd.

CB02 Change of applicant information
CB02 Change of applicant information

Address after: 230000 zone B, 19th floor, building A1, 3333 Xiyou Road, hi tech Zone, Hefei City, Anhui Province

Applicant after: Dingfu Intelligent Technology Co., Ltd

Address before: Room 630, 6th floor, Block A, Wanliu Xingui Building, 28 Wanquanzhuang Road, Haidian District, Beijing

Applicant before: DINFO (BEIJING) SCIENCE DEVELOPMENT Co.,Ltd.

GR01 Patent grant
GR01 Patent grant