CN108920622A - Training method, training device, and recognition device for intent recognition - Google Patents
Training method, training device, and recognition device for intent recognition
- Publication number
- CN108920622A (application CN201810694995.4A)
- Authority
- CN
- China
- Prior art keywords
- model
- training
- vector
- intent recognition
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
A training method, training device, and recognition device for intent recognition. The method includes: obtaining corpus text vectors corresponding to the corpus entries in a corpus database; constructing a joint loss function formula for the training models; obtaining training data; cutting the training data and mapping the result to the corresponding corpus text vectors, denoted training vectors; predicting on the training vectors with each training model and, from the prediction results, computing each model's loss function value; substituting the per-model loss function values into the joint loss function formula to obtain the joint loss function value; and judging whether the joint loss function value is below a set threshold. If so, training ends; if not, each model's parameters are updated and iterative training continues. The method thereby improves the generalization ability of intent recognition as far as possible and alleviates the semantic ambiguity and error-tolerance problems.
Description
Technical field
The present invention relates to the field of intelligent technologies, and in particular to a training method, training device, and recognition device for intent recognition based on multi-task learning.
Background art
" intention assessment " refers to one section inputted to user for expressing the information of query demand, determines affiliated intention
Classification.Current intention assessment technology is mainly used in search engine, interactive system etc., can specifically be divided into based on mould
Plate/dictionary and method based on Supervised classification excavate specific meaning based on template/dictionary method from user's history input
Artwork version/dictionary, if user's input matches with the word in template/dictionary of corresponding classification, then it is assumed that the input belongs to this
It is intended to classification;Method based on Supervised classification is then based on history input data building intent classifier model, prediction user's input
It is intended to classification.Applicant it has been investigated that, defect existing for technology essentially consists in following several respects at present:
1., generalization ability, the method based on template is limited to template and dictionary coverage rate problem, based on Supervised classification
Method is then limited to training corpus data scale and data quality problem.
2., ambiguity problem, semantic missing and Fault-Tolerant Problems, often there is imperfect, semantic missing in short text and input is wrong
Accidentally the problems such as, e.g., user's input, " play and after the night of bear ", are actually intended to search " the midnight seraglio of Teddy bear ".
In addition, some prior art attempts to classify with methods based on multi-task learning. However, such methods suffer from two problems. First, they obtain text vectors by training on other tasks and then concatenate them with the text vectors of the target task before retraining the classifier of that task, so errors from the other tasks may negatively affect the target task. Second, if the other tasks are unrelated to the current task, a large amount of irrelevant external information is introduced, which may instead perturb the classification results.
Therefore, how to solve the semantic omission, error-tolerance, ambiguity, and generalization problems in intent recognition has become one of the technical problems urgently to be solved by those skilled in the art.
Summary of the invention
In view of this, embodiments of the present invention provide a training method, training device, and recognition device for intent recognition based on multi-task learning, so as to address the model generalization, ambiguity, and semantic error-tolerance problems in intent recognition.
To achieve the above object, the embodiment of the present invention provides the following technical solutions:
A training method for intent recognition, including:
mapping the corpus entries in a corpus database to a semantic space to obtain low-dimensional dense vectors corresponding to the corpus entries, denoted corpus text vectors;
constructing a joint loss function formula for the training models, the training models including an intent recognition model, a similar-short-text generation model, and an entity recognition model;
obtaining training data for the training models;
cutting the training data and mapping the processed training data to the corresponding corpus text vectors, denoted training vectors;
inputting the training vectors into the training models and, based on the prediction results output by the training models and the true results corresponding to the training vectors, computing the loss function values of the intent recognition model, the similar-short-text generation model, and the entity recognition model;
substituting the loss function value of each model into the joint loss function formula to obtain the joint loss function value; and
judging whether the joint loss function value is less than a set value; if not, adjusting the training parameters of the intent recognition model, the similar-short-text generation model, and the entity recognition model so as to reduce the model loss function values and continuing iterative training; if so, ending training.
Preferably, in the above training method for intent recognition, constructing the joint loss function formula of the training models includes:
constructing the joint loss function formula of the intent recognition model, the similar-short-text generation model, and the entity recognition model as
loss_total = α·loss_intent_recognition + β·loss_sim_query_generation + γ·loss_entity_recognition,
where α, β, and γ are preset loss weight factors, loss_intent_recognition is the loss function of the intent recognition model, loss_entity_recognition is the loss function of the entity recognition model, and loss_sim_query_generation is the loss function of the similar-text generation model.
Preferably, in the above training method for intent recognition, mapping the corpus entries in the corpus database to the semantic space to obtain the low-dimensional dense vectors corresponding to the corpus entries includes:
cutting the corpus entries in the corpus database at character-level, word-level, or pinyin-level granularity; and
training on the cut text with a neural network model to obtain the low-dimensional dense vectors, where the low-dimensional dense vectors include character vectors, word vectors, or pinyin vectors.
Preferably, in the above training method for intent recognition, modeling the intent recognition task to obtain the intent recognition model includes:
modeling the intent recognition task with an LSTM model to obtain the intent recognition model, whose input is a query and whose output is an intent class label.
Preferably, in the above training method for intent recognition, modeling similar short texts to obtain the similar-short-text generation model includes:
building the similar-short-text generation model for similar short texts with a Seq2Seq model, whose input is the user's input query and whose output is a similar short text.
Preferably, in the above training method for intent recognition, modeling the entity recognition task to obtain the entity recognition task model includes:
constructing a multi-class classification model based on a convolutional neural network, using the multi-class classification model as the entity recognition task model, and training the entity recognition task on training data, the training data being the context text of the entities contained in the text to be recognized; the model input is a short text containing an entity, and the output is an entity type label.
A training device for intent recognition, including:
a corpus vector training unit, configured to map the corpus entries in a corpus database to a semantic space to obtain low-dimensional dense vectors corresponding to the corpus entries, denoted corpus text vectors;
a model storage unit, configured to store an intent recognition model, a similar-short-text generation model, and an entity recognition model;
a joint loss function formula storage unit, configured to store the joint loss function formula of the training models, the training models including the intent recognition model, the similar-short-text generation model, and the entity recognition model;
a training data collection unit, configured to obtain training data for the training models;
a training vector acquisition unit, configured to cut the training data and map the processed training data to the corresponding corpus text vectors, denoted training vectors;
a loss function value computing unit, configured to input the training vectors into the training models, compute the loss function values of the intent recognition model, the similar-short-text generation model, and the entity recognition model based on the prediction results output by the training models and the true results corresponding to the training vectors, and substitute the loss function value of each model into the joint loss function formula to obtain the joint loss function value; and
a parameter adjustment unit, configured to judge whether the joint loss function value is less than a set value; if not, to adjust the training parameters of the intent recognition model, the similar-short-text generation model, and the entity recognition model so as to reduce the model loss function values and continue iterative training; if so, to end training.
Preferably, in the above training device for intent recognition, the joint loss function formula is:
loss_total = α·loss_intent_recognition + β·loss_sim_query_generation + γ·loss_entity_recognition,
where α, β, and γ are preset loss weight factors, loss_intent_recognition is the loss function of the intent recognition model, loss_entity_recognition is the loss function of the entity recognition model, and loss_sim_query_generation is the loss function of the similar-text generation model.
Preferably, in the above training device for intent recognition, the corpus vector training unit is specifically configured to:
cut the corpus entries in the corpus database at character-level, word-level, or pinyin-level granularity; and
train on the cut text with a neural network model to obtain the low-dimensional dense vectors, where the low-dimensional dense vectors include character vectors, word vectors, or pinyin vectors.
Preferably, in the above training device for intent recognition, the intent recognition model stored in the model storage unit is obtained by modeling the intent recognition task with an LSTM model; its input is a query and its output is an intent class label.
The similar-short-text generation model stored in the model storage unit is built for similar short texts with a Seq2Seq model; its input is the user's input query and its output is a similar short text.
The entity recognition task model stored in the model storage unit is a multi-class classification model constructed based on a convolutional neural network; its training data is the context text of the entities contained in the text to be recognized, its input is a short text containing an entity, and its output is an entity type label.
An intent recognition device, including a memory and a processor. The memory stores the intent recognition model, similar-short-text generation model, and entity recognition model obtained by training with any one of the above training methods for intent recognition, and the processor is configured to, upon receiving a user query, call and execute the intent recognition model, the similar-short-text generation model, and the entity recognition model.
Based on the above technical solutions, the schemes provided by the embodiments of the present invention process the input text in a multi-task learning manner, using the intent recognition model, the similar-short-text generation model, and the entity recognition model obtained above. Relevant linguistic knowledge can thus be learned more effectively, the generalization ability of the intent recognition model is improved, and the semantic ambiguity problems in intent recognition can be alleviated as far as possible.
Brief description of the drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a training method for intent recognition disclosed in an embodiment of the present application;
Fig. 2 is an example of constructing an intent recognition model based on an LSTM model;
Fig. 3 is a schematic structural diagram of a training device for intent recognition provided by the present application;
Fig. 4 is a schematic structural diagram of an intent recognition device disclosed in an embodiment of the present application.
Detailed description of embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
Ambiguity problems often arise when performing intent recognition on input text in the prior art. The present application therefore discloses a training method for intent recognition; referring to Fig. 1, the method may include:
Step S101: train on the corpus entries in the corpus database to obtain text vectors corresponding to the corpus entries.
Specifically, in this step the corpus entries in the corpus database are mapped to a semantic space to obtain the low-dimensional dense vectors corresponding to the corpus entries, denoted corpus text vectors. The concrete form of a low-dimensional dense vector may be a character vector, word vector, or pinyin vector, and the training model used may be a model such as word2vec. The corpus database stores the user's historical text input information, click logs, speech recognition results, and other corpus material, all in text form. Text input information is the input data keyed in by the user through an input device. Click logs record the linked content the user clicked; for example, when the user searches and clicks a link with the mouse, the clicked content and the corresponding query are stored in the click log. Voice information is the voice content produced when the user performs a voice search or other voice interaction; it can first be converted to text by speech recognition technology and then stored in the corpus database as voice corpus material. Specifically, the corpus database may be populated by a preset data grabber that captures the corresponding corpus material from the logs of each system in the electronic device. For example, the corpus entries in the corpus database may be collected by the data grabber from massive short-text data such as search query logs, click logs, question-and-answer knowledge base titles, microblog posts, and voice conversation logs, and the captured historical input text information, click log information, voice information, and so on are saved to the corpus database.
This step specifically includes the following. First, a text (corpus entry) in the corpus database is cut at the corresponding character-level/word-level/pinyin-level granularity; then a neural network model is trained so that the cut text is represented as low-dimensional dense real-valued vectors. For example, for the text "中国人民" ("the Chinese people"), cutting by character, word, or pinyin yields "中/国/人/民", "中国/人民", and "zhongguo/renmin" respectively. In the above scheme, obtaining word vectors corresponding to the text addresses the out-of-vocabulary problem, and obtaining pinyin vectors corresponding to the text addresses the mis-conversion problem of error-tolerant speech recognition. Of course, for ease of storage and retrieval, the present application may store the corpus text vectors corresponding to each corpus entry in matrix form: after the low-dimensional dense vectors corresponding to a corpus entry are obtained with the word2vec model, those vectors are combined to obtain the feature vector matrix corresponding to the text. For example, if 100-dimensional character vectors are selected in a specific implementation, then after training "中/国/人/民" yields a 4×100 character vector matrix, in which each row is the vector of one character and each column is one dimension of the character vectors, as follows:
"中" [-0.1111, 0.111, ..., 0.1233]
"国" [0.5324, 0.3678, ..., -0.1111]
"人" [-0.1111, 0.1915, ..., 0.4996]
"民" [-0.1111, 0.8113, ..., 0.3267].
Step S102: construct the joint loss function formula of the training models, the training models including the intent recognition model, the similar-short-text generation model, and the entity recognition model.
Before constructing the joint loss function formula of the training models, each task must first be modeled to obtain the training models. Specifically, the intent recognition task, the similar-short-text task, and the entity recognition task are modeled separately to obtain the intent recognition model, the similar-short-text generation model, and the entity recognition model.
For the intent recognition task, the training data consists of user-input queries and the intent class labels corresponding to those queries. The query is first segmented into words or characters, and based on the training result of step S101 each word or character is mapped to its corresponding vector; all the vectors the words or characters map to are then averaged to obtain the vector corresponding to the query, denoted V1. Entity recognition is then performed on the query to check whether it matches a specific pattern in a preset rule pattern table. For example, for the query "播放老九门" ("play Lao Jiu Men"), "老九门" is recognized as an album by the preset rule pattern table, matching the "play + album" pattern in the table. The query thus yields a corresponding K-dimensional vector V2, where K is the number of patterns in the rule pattern table: a value of 1 in the i-th dimension indicates that the query matches the i-th pattern in the table, and a value of 0 indicates no match. V1 and V2 are then used as the input of the preset neural network model. In this embodiment, the structure of the neural network includes an input layer, an LSTM layer, a dense layer, and a skip layer; for details, see the Fig. 2 example and the sketch below.
In this embodiment, taking a video search engine as an example: if the input query is "播放老九门" ("play Lao Jiu Men"), the output intent class label is "PLAY_VIDEO"; if the input query is "download Honor of Kings", the output intent class label is "DOWNLOAD_GAME"; and if the input query is "recharge membership", the output intent label is "PURCHASE".
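The following PyTorch sketch gives one plausible reading of the Fig. 2 architecture, offered as an illustration rather than the patented implementation: the embedded query sequence passes through the LSTM, while the rule-pattern vector V2 skips directly to the dense layer. All dimensions, and the exact use of V1 versus the raw token sequence, are assumptions.

```python
# Minimal sketch of the intent model: embedding + LSTM branch for the query,
# with the K-dimensional 0/1 pattern vector V2 fed in through a skip connection.
import torch
import torch.nn as nn

class IntentModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128,
                 k_patterns=50, n_intents=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)  # shared across the tasks
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        # dense layer sees the final LSTM state concatenated with V2 (the skip input)
        self.dense = nn.Linear(hidden + k_patterns, n_intents)

    def forward(self, token_ids, v2):
        x = self.emb(token_ids)          # (B, T, emb_dim)
        _, (h, _) = self.lstm(x)         # h: (1, B, hidden)
        features = torch.cat([h[-1], v2], dim=-1)
        return self.dense(features)      # intent-class logits

# Usage: a tokenized query plus its K-dim rule-pattern match vector.
model = IntentModel(vocab_size=5000)
logits = model(torch.randint(0, 5000, (1, 5)), torch.zeros(1, 50))
```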
For the similar-short-text generation task, the training data consists of similar short-text pairs, specifically the following three types: (1) a query and another query that clicked the same document; (2) a query and another query from the same session; (3) a query and the title it clicked. When modeling the similar-short-text generation task, the similar-short-text generation model is built for similar short texts based on a Seq2Seq model; its input is the user query and its output is a short text similar to that query.
The goal of the Seq2Seq model is to output a sequence Y given a sequence X; its loss function value is computed with cross entropy. The Seq2Seq model consists of two recurrent neural networks, an encoder and a decoder: the encoder network converts the input sequence into a fixed-length vector, and the decoder network generates a new sequence from the vector produced by the encoder. During model training, the encoder network's vectors are shared with the other tasks. When predicting on a text, for example, if the model input is "how to buy a membership", the encoder network feeds this text into the network character by character or word by word and converts it into a fixed-length vector, and the decoder network converts the fixed-length vector into a new text output, such as "how to recharge a membership".
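As a non-limiting illustration of this encoder-decoder structure, the sketch below uses GRU cells and teacher forcing; the cell type, hidden size, and training regime are assumptions, while the shared embedding table reflects the sharing with the other tasks described above.

```python
# Minimal sketch of the Seq2Seq branch: an encoder RNN compresses the query
# into a fixed-length vector; a decoder RNN generates the similar short text.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, shared_emb, hidden=128):
        super().__init__()
        self.emb = shared_emb                    # vectors shared via step S101
        emb_dim = shared_emb.embedding_dim
        self.encoder = nn.GRU(emb_dim, hidden, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, shared_emb.num_embeddings)

    def forward(self, src_ids, tgt_ids):
        _, h = self.encoder(self.emb(src_ids))   # fixed-length context vector
        dec_out, _ = self.decoder(self.emb(tgt_ids), h)
        return self.out(dec_out)                 # per-step vocabulary logits

# Cross entropy between these logits and the reference similar query gives
# loss_sim_query_generation.
shared = nn.Embedding(5000, 100)
model = Seq2Seq(shared)
logits = model(torch.randint(0, 5000, (1, 6)), torch.randint(0, 5000, (1, 8)))
```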
For the entity recognition task, a multi-class classification model is constructed based on a convolutional neural network as the entity recognition model, and the entity recognition task is trained on training data using the multi-class classification model. The training data of the entity recognition model is the context text containing the entities; the model input is a short text containing an entity, and the output is an entity type label. Concrete sources of training data include public internet corpus resources such as encyclopedias, or corpus data constructed by hand annotation by personnel related to the present application.
In this embodiment, the convolutional neural network architecture includes the following layers: a vector lookup layer, linear transformation layer 1, a sigmoid layer, linear transformation layer 2, and a whole-sentence analysis layer. When training the convolutional neural network, the neural network model is based on the stochastic gradient descent algorithm and is trained with a cross-entropy loss function value. When predicting the entity type labels of an input text, for example the input "北京" (Beijing), at time t the Chinese character to be processed is "北": the vector lookup layer maps "北" to a real-valued vector, which is passed to the linear transformation layer, and after processing by the linear layer and the sigmoid layer the system scores all candidate labels for "北"; the higher a label's score, the higher its probability. At the next time t+1, the system processes the next character in the sentence, "京". In the sentence analysis layer, a score network is generated for the processed text, in which the nodes in column t are the scores of all candidate entity type labels for the character processed at time t, and the edges between the nodes of column t+1 and column t carry transition scores describing the likelihood of moving from one label to another. Finally, the Viterbi algorithm finds the path with the highest overall score in the network as the final label sequence. For example, the entity type label path corresponding to "北京" is "B-LOC I-LOC".
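To make the sentence-level decoding concrete, the following is a minimal self-contained sketch of Viterbi decoding over per-character label scores and a label-transition matrix; the emission and transition values are hypothetical numbers chosen only so the example decodes "北京" to "B-LOC I-LOC".

```python
# Minimal sketch of the whole-sentence analysis layer's decoding: combine
# per-character label scores with transition scores and take the best path.
import numpy as np

def viterbi(emissions, transitions):
    """emissions: (T, L) per-character label scores; transitions: (L, L)."""
    T, L = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, L), dtype=int)
    for t in range(1, T):
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)   # best previous label for each label
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):        # backtrack the best path
        path.append(back[t][path[-1]])
    return path[::-1]

labels = ["B-LOC", "I-LOC", "O"]
# Hypothetical scores for the two characters of "北京" over three labels.
em = np.array([[2.0, 0.1, 0.5], [0.2, 1.8, 0.6]])
tr = np.array([[-1.0, 1.5, 0.0], [0.0, 1.0, 0.5], [0.5, -1.0, 0.2]])
print([labels[i] for i in viterbi(em, tr)])  # ['B-LOC', 'I-LOC']
```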
After the training models are built, the loss function value of each training model is constructed: the loss function value of the intent recognition model is loss_intent_recognition, the loss function value of the entity recognition model is loss_entity_recognition, and the loss function value of the similar-text generation model is loss_sim_query_generation. The values of loss_intent_recognition, loss_entity_recognition, and loss_sim_query_generation can be obtained with the mean squared error or cross-entropy method when the training models are trained. Each loss function value is assigned a specific loss weight factor, and the sum of the three weighted loss function values serves as the joint loss function value of the intent recognition model, the similar-short-text generation model, and the entity recognition model; that is, the joint loss function value loss_total is computed by the joint loss function formula
loss_total = α·loss_intent_recognition + β·loss_sim_query_generation + γ·loss_entity_recognition,
where α, β, and γ are the loss weight factors of loss_intent_recognition, loss_sim_query_generation, and loss_entity_recognition respectively; each takes a value in (0, 1), and the concrete values can be adjusted according to user demand. In other words, each of the three models has its own loss function. For a user input x, the loss function of the intent recognition model is f1(x), corresponding to loss_intent_recognition; the loss function of the similar-short-text model is f2(x), corresponding to loss_sim_query_generation; and the loss function of the entity recognition model is f3(x), corresponding to loss_entity_recognition. The joint loss function is then the linear weighted sum of the three loss functions, α·f1(x) + β·f2(x) + γ·f3(x).
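As a worked illustration of this formula (the weight values below are hypothetical, since the disclosure only requires each weight to lie in (0, 1)):

```python
# Minimal sketch of the joint loss, assuming the three per-task losses have
# already been computed as scalars (e.g. via cross entropy).
def joint_loss(loss_intent, loss_sim_query, loss_entity,
               alpha=0.5, beta=0.3, gamma=0.2):
    """Weighted sum; the weights are preset loss weight factors in (0, 1)."""
    return alpha * loss_intent + beta * loss_sim_query + gamma * loss_entity

loss_total = joint_loss(0.82, 1.10, 0.47)  # = 0.5*0.82 + 0.3*1.10 + 0.2*0.47
```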
Step S103: obtain the training data for the training models.
The training data includes the training data for the intent recognition model, the similar-short-text generation model, and the entity recognition model.
In this step, the training data is some of the data in a preset training data set, whose concrete content can be selected according to user demand. The training data serves as the input text of each of the above models, and the actual similar short texts and class labels corresponding to each training datum are known.
Step S104: cut the training data.
This step specifically includes segmenting the training data of the intent recognition model, the similar-short-text generation model, and the entity recognition model into words, characters, or a mixture of words and characters.
Step S105: map the processed training data to the corresponding corpus text vectors, denoted training vectors.
The input text of each model is cut into words or characters, or by mixed word/character cutting. When the input text is Chinese, a segmentation model based on hidden Markov models, characters, or conditional random fields can be chosen to cut it; when the input text is English, it can be cut using punctuation and spaces as separators. If an embodiment selects word segmentation, the input text "我想看老九门" ("I want to watch Lao Jiu Men") is split into "我/想/看/老九门", and the input text "I want to play game" is split into "I/want/to/play/game". If an embodiment selects character segmentation, the text is cut character by character; for example, the input text "我想看老九门" can be cut into "我/想/看/老/九/门". In addition, mixed word/character cutting can be chosen according to application needs, for example cutting the Chinese in the input text by character and the English in the input text by word, e.g. cutting "我想看Billions" into "我/想/看/Billions".
After the training data input to each model has been cut, the cut training data is mapped to the corpus text vectors obtained in step S101. The detailed process is: obtain the corpus database text vectors corresponding to the corpus entries identical to the training data; these corpus database text vectors are the training vectors corresponding to the training data, and may concretely be character vectors, word vectors, or pinyin vectors. If the corpus text vectors are pinyin vectors, the characters or words of the training data are first converted to pinyin and then mapped to the corresponding pinyin vectors. The three models can share one set of character/word/pinyin vectors.
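The sketch below illustrates steps S104-S105 under stated assumptions: jieba and pypinyin are one possible choice of word segmenter and pinyin converter (the disclosure does not mandate them), and `word_vectors` stands in for the shared embedding table trained in step S101 (e.g. a dict or gensim KeyedVectors).

```python
# Minimal sketch: cut the training text at character, word, or pinyin
# granularity, then look up the shared vectors to form the training vectors.
import jieba                      # Chinese word segmentation (assumed choice)
from pypinyin import lazy_pinyin  # character-to-pinyin conversion (assumed choice)

def cut(text, granularity="char"):
    if granularity == "char":
        return list(text)
    if granularity == "word":
        return list(jieba.cut(text))
    if granularity == "pinyin":
        return lazy_pinyin(text)
    raise ValueError(granularity)

def to_training_vectors(text, word_vectors, granularity="char"):
    # Tokens absent from the trained table are simply skipped in this sketch.
    return [word_vectors[tok]
            for tok in cut(text, granularity) if tok in word_vectors]
```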
Step S106: input the training vectors into the training models, and compute the loss function values of the intent recognition model, the similar-short-text generation model, and the entity recognition model based on the prediction results output by the training models and the true results corresponding to the training vectors.
In this step, each round of training cycles through all the tasks, randomly selecting the training vectors corresponding to a portion of the training data as model input, and the loss function value of each model is computed from the model's output and the true results corresponding to the training data. Specifically, in this embodiment, for each model the true result corresponding to each training datum is one-hot encoded, and the cross-entropy loss function between the prediction and the one-hot encoding is computed to obtain each model's loss function value.
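For illustration, the per-model loss of step S106 reduces to the following computation (the class count and probability values are hypothetical):

```python
# Minimal numpy sketch: one-hot encode the true label and compare it with the
# predicted distribution using cross entropy.
import numpy as np

def cross_entropy(pred_probs, true_label, n_classes):
    one_hot = np.eye(n_classes)[true_label]
    return -np.sum(one_hot * np.log(pred_probs + 1e-12))

# e.g. 3 intent classes, the model predicts [0.7, 0.2, 0.1], truth is class 0
print(cross_entropy(np.array([0.7, 0.2, 0.1]), 0, 3))  # ~0.357
```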
Step S107: compute the joint loss function value based on the joint loss function formula and the loss function values of the models.
In this step, each loss function value computed in step S106 is substituted into the joint loss function formula to obtain the joint loss function value.
Step S108: judge whether the joint loss function value is less than the set value; if so, training ends; if not, execute step S109.
Step S109: update the parameters of the intent recognition model, the similar-short-text generation model, and the entity recognition model based on the back-propagation algorithm; the input vectors shared by the three models, being model parameters, are also synchronously updated. Then return to step S103 to continue iterative training.
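The loop of steps S103-S109 can be summarized by the following sketch. It is an illustration only: `intent_model`, `seq2seq_model`, `ner_model`, their `.loss()` methods, the `batches` iterator, and the threshold are assumptions standing in for the components described above.

```python
# Minimal PyTorch sketch of the joint training loop: sample a batch per task,
# combine the three losses, test the threshold, and backpropagate through all
# shared and task-specific parameters.
import torch

def train(models, batches, optimizer, alpha, beta, gamma, threshold):
    intent_model, seq2seq_model, ner_model = models
    while True:
        intent_batch, sim_batch, ner_batch = next(batches)  # step S103
        l1 = intent_model.loss(intent_batch)    # loss_intent_recognition
        l2 = seq2seq_model.loss(sim_batch)      # loss_sim_query_generation
        l3 = ner_model.loss(ner_batch)          # loss_entity_recognition
        loss_total = alpha * l1 + beta * l2 + gamma * l3    # step S107
        if loss_total.item() < threshold:       # step S108: converged
            return
        optimizer.zero_grad()
        loss_total.backward()                   # step S109: update all models,
        optimizer.step()                        # shared embeddings included
```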
In the above scheme, the input text is processed by way of multi-task learning, so relevant linguistic knowledge can be learned more effectively, the generalization ability of the intent recognition model is improved, and the semantic omission, error-tolerance, ambiguity, and generalization problems in intent recognition are alleviated as far as possible. Specifically, the parameters of the intent recognition model, the similar-short-text generation model, and the entity recognition model can be updated based on the back-propagation algorithm.
Corresponding to the above method, the present application also discloses a training device for intent recognition; referring to Fig. 3, it includes:
a corpus vector training unit 01, configured to map the corpus entries in the corpus database to a semantic space to obtain the low-dimensional dense vectors corresponding to the corpus entries, denoted corpus text vectors;
a model storage unit 02, configured to store the intent recognition model, the similar-short-text generation model, and the entity recognition model;
a joint loss function formula storage unit 03, configured to store the joint loss function formula of the training models, the training models including the intent recognition model, the similar-short-text generation model, and the entity recognition model;
a training data collection unit 04, configured to obtain the training data for the training models;
a training vector acquisition unit 05, configured to cut the training data and map the processed training data to the corresponding corpus text vectors, denoted training vectors;
a loss function value computing unit 06, configured to input the training vectors into the training models, compute the loss function values of the intent recognition model, the similar-short-text generation model, and the entity recognition model based on the prediction results output by the training models and the true results corresponding to the training vectors, and substitute the loss function value of each model into the joint loss function formula to obtain the joint loss function value; and
a parameter adjustment unit 07, configured to judge whether the joint loss function value is less than the set value; if not, to adjust the training parameters of the intent recognition model, the similar-short-text generation model, and the entity recognition model so as to reduce the model loss function values and continue iterative training; if so, to end training.
Corresponding to the above method, the corpus database stores the user's historical text input information, click logs, and voice information; the corpus text vectors can be trained on this information, and the training data sets of the three models constructed from it.
Corresponding to the above method, when training vectors, the corpus vector training unit is specifically configured to:
cut the corpus entries in the corpus database at character-level, word-level, or pinyin-level granularity; and
train on the cut text with a neural network model to obtain the low-dimensional dense vectors, where the low-dimensional dense vectors include character vectors, word vectors, or pinyin vectors.
Corresponding to the above method, the intent recognition model stored in the model storage unit is obtained by modeling the intent recognition task with an LSTM model; its input is a query and its output is an intent class label.
The similar-short-text generation model stored in the model storage unit is built for similar short texts with a Seq2Seq model; its input is the user's input query and its output is a similar short text.
The entity recognition task model stored in the model storage unit is a multi-class classification model constructed based on a convolutional neural network; its training data is the context text of the entities contained in the text to be recognized, its input is a short text containing an entity, and its output is an entity type label.
Corresponding to the above training method for intent recognition, the present application also discloses an intent recognition device. Referring to Fig. 4, a schematic structural diagram of the intent recognition device disclosed in an embodiment of the present application, the device may include:
a memory 100 and a processor 200.
The intent recognition device further includes a communication interface 300 and a communication bus 400, through which the memory 100, the processor 200, and the communication interface 300 communicate with one another.
The memory 100 is configured to store program code, the program code including computer operation instructions. Specifically, the memory stores the program code of the intent recognition model, the similar-short-text generation model, and the entity recognition model obtained by training with the training method for intent recognition disclosed in any one of the above embodiments of the present application.
The memory 100 may include high-speed RAM, and may also include non-volatile memory, for example at least one magnetic disk storage.
The processor 200 may be a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present invention. The processor 200 is configured to call and execute the program code; specifically, the processor is configured to, upon receiving a user query, call and execute the intent recognition model, the similar-short-text generation model, and the entity recognition model.
For convenience of description, the above system is described in terms of various modules divided by function. Of course, when implementing the present application, the functions of the modules may be realized in one or more pieces of software and/or hardware.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, for the system or system embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and for relevant parts reference may be made to the description of the method embodiments. The system and system embodiments described above are merely schematic: the units described as separate parts may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
Professionals will further appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be realized with electronic hardware, computer software, or a combination of the two. In order to clearly demonstrate the interchangeability of hardware and software, the composition and steps of each example have been described generally in terms of function in the above description. Whether these functions are implemented in hardware or in software depends on the specific application and the design constraints of the technical solution. Professional technicians may use different methods to achieve the described functions for each specific application, but such realizations should not be considered beyond the scope of the present invention.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium well known in the technical field.
It should also be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not explicitly listed, or further includes elements intrinsic to that process, method, article, or device. In the absence of further restrictions, an element limited by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein can be realized in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (11)
1. A training method for intent recognition, characterized by including:
mapping the corpus entries in a corpus database to a semantic space to obtain low-dimensional dense vectors corresponding to the corpus entries, denoted corpus text vectors;
constructing a joint loss function formula for training models, the training models including an intent recognition model, a similar-short-text generation model, and an entity recognition model;
obtaining training data for the training models;
cutting the training data and mapping the processed training data to the corresponding corpus text vectors, denoted training vectors;
inputting the training vectors into the training models, and computing the loss function values of the intent recognition model, the similar-short-text generation model, and the entity recognition model based on the prediction results output by the training models and the true results corresponding to the training vectors;
substituting the loss function value of each model into the joint loss function formula to obtain a joint loss function value; and
judging whether the joint loss function value is less than a set value; if not, adjusting the training parameters of the intent recognition model, the similar-short-text generation model, and the entity recognition model so as to reduce the model loss function values and continuing iterative training; if so, ending training.
2. The training method for intent recognition according to claim 1, characterized in that constructing the joint loss function formula of the training models includes:
constructing the joint loss function formula of the intent recognition model, the similar-short-text generation model, and the entity recognition model as loss_total = α·loss_intent_recognition + β·loss_sim_query_generation + γ·loss_entity_recognition, where α, β, and γ are preset loss weight factors, loss_intent_recognition is the loss function of the intent recognition model, loss_entity_recognition is the loss function of the entity recognition model, and loss_sim_query_generation is the loss function of the similar-text generation model.
3. The training method for intent recognition according to claim 1, characterized in that mapping the corpus entries in the corpus database to the semantic space to obtain the low-dimensional dense vectors corresponding to the corpus entries includes:
cutting the corpus entries in the corpus database at character-level, word-level, or pinyin-level granularity; and
training on the cut text with a neural network model to obtain the low-dimensional dense vectors, where the low-dimensional dense vectors include character vectors, word vectors, or pinyin vectors.
4. The training method for intent recognition according to claim 1, characterized in that modeling the intent recognition task to obtain the intent recognition model includes:
modeling the intent recognition task with an LSTM model to obtain the intent recognition model, whose input is a query and whose output is an intent class label.
5. The training method for intent recognition according to claim 1, characterized in that modeling similar short texts to obtain the similar-short-text generation model includes:
building the similar-short-text generation model for similar short texts with a Seq2Seq model, whose input is the user's input query and whose output is a similar short text.
6. The training method for intent recognition according to claim 1, characterized in that modeling the entity recognition task to obtain the entity recognition task model includes:
constructing a multi-class classification model based on a convolutional neural network, using the multi-class classification model as the entity recognition task model, and training the entity recognition task on training data, the training data being the context text of the entities contained in the text to be recognized; the model input is a short text containing an entity, and the output is an entity type label.
7. A training device for intent recognition, characterized by including:
a corpus vector training unit, configured to map the corpus entries in a corpus database to a semantic space to obtain low-dimensional dense vectors corresponding to the corpus entries, denoted corpus text vectors;
a model storage unit, configured to store an intent recognition model, a similar-short-text generation model, and an entity recognition model;
a joint loss function formula storage unit, configured to store the joint loss function formula of the training models, the training models including the intent recognition model, the similar-short-text generation model, and the entity recognition model;
a training data collection unit, configured to obtain training data for the training models;
a training vector acquisition unit, configured to cut the training data and map the processed training data to the corresponding corpus text vectors, denoted training vectors;
a loss function value computing unit, configured to input the training vectors into the training models, compute the loss function values of the intent recognition model, the similar-short-text generation model, and the entity recognition model based on the prediction results output by the training models and the true results corresponding to the training vectors, and substitute the loss function value of each model into the joint loss function formula to obtain a joint loss function value; and
a parameter adjustment unit, configured to judge whether the joint loss function value is less than a set value; if not, to adjust the training parameters of the intent recognition model, the similar-short-text generation model, and the entity recognition model so as to reduce the model loss function values and continue iterative training; if so, to end training.
8. The training device for intent recognition according to claim 7, characterized in that the joint loss function formula is:
loss_total = α·loss_intent_recognition + β·loss_sim_query_generation + γ·loss_entity_recognition,
where α, β, and γ are preset loss weight factors, loss_intent_recognition is the loss function of the intent recognition model, loss_entity_recognition is the loss function of the entity recognition model, and loss_sim_query_generation is the loss function of the similar-text generation model.
9. The training device for intent recognition according to claim 7, characterized in that the corpus vector training unit is specifically configured to:
cut the corpus entries in the corpus database at character-level, word-level, or pinyin-level granularity; and
train on the cut text with a neural network model to obtain the low-dimensional dense vectors, where the low-dimensional dense vectors include character vectors, word vectors, or pinyin vectors.
10. The training device for intent recognition according to claim 7, characterized in that the intent recognition model stored in the model storage unit is obtained by modeling the intent recognition task with an LSTM model, its input being a query and its output an intent class label;
the similar-short-text generation model stored in the model storage unit is built for similar short texts with a Seq2Seq model, its input being the user's input query and its output a similar short text; and
the entity recognition task model stored in the model storage unit is a multi-class classification model constructed based on a convolutional neural network, its training data being the context text of the entities contained in the text to be recognized, its input a short text containing an entity, and its output an entity type label.
11. An intent recognition device, characterized by including a memory and a processor; the memory stores the intent recognition model, similar-short-text generation model, and entity recognition model obtained by training with the training method for intent recognition of any one of claims 1-6, and the processor is configured to, upon receiving a user query, call and execute the intent recognition model, the similar-short-text generation model, and the entity recognition model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810694995.4A CN108920622B (en) | 2018-06-29 | 2018-06-29 | Training method, training device and recognition device for intention recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810694995.4A CN108920622B (en) | 2018-06-29 | 2018-06-29 | Training method, training device and recognition device for intention recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108920622A true CN108920622A (en) | 2018-11-30 |
CN108920622B CN108920622B (en) | 2021-07-20 |
Family
ID=64423645
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810694995.4A Active CN108920622B (en) | 2018-06-29 | 2018-06-29 | Training method, training device and recognition device for intention recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108920622B (en) |
WO2021164244A1 (en) * | 2020-02-18 | 2021-08-26 | 百度在线网络技术(北京)有限公司 | Voice interaction method and apparatus, device and computer storage medium |
CN113486178A (en) * | 2021-07-12 | 2021-10-08 | 恒安嘉新(北京)科技股份公司 | Text recognition model training method, text recognition device and medium |
CN114997154A (en) * | 2022-05-11 | 2022-09-02 | 北京科东电力控制系统有限责任公司 | Automatic construction method and system for speaker-to-speaker robot corpus |
US11507854B2 (en) * | 2019-01-29 | 2022-11-22 | Ricoh Company, Ltd. | Method and apparatus for recognizing intention, and non-transitory computer-readable recording medium |
- 2018-06-29: Application CN201810694995.4A filed in China (CN); granted as patent CN108920622B, legal status Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170193099A1 (en) * | 2015-12-31 | 2017-07-06 | Quixey, Inc. | Machine Identification of Grammar Rules That Match a Search Query |
CN106156003A (en) * | 2016-06-30 | 2016-11-23 | 北京大学 | A kind of question sentence understanding method in question answering system |
CN106815566A (en) * | 2016-12-29 | 2017-06-09 | 天津中科智能识别产业技术研究院有限公司 | A kind of face retrieval method based on multitask convolutional neural networks |
CN107704866A (en) * | 2017-06-15 | 2018-02-16 | 清华大学 | Multitask Scene Semantics based on new neural network understand model and its application |
CN107464559A (en) * | 2017-07-11 | 2017-12-12 | 中国科学院自动化研究所 | Joint forecast model construction method and system based on Chinese rhythm structure and stress |
Non-Patent Citations (2)
Title |
---|
XIAODONG ZHANG et al.: "A Joint Model of Intent Determination and Slot Filling for Spoken Language Understanding", Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence * |
XIANG Yang (向阳): "Research on Intelligent Model Construction Systems" (《模型智能构造系统研究》), 31 August 2000 * |
Cited By (93)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111292752B (en) * | 2018-12-06 | 2023-05-12 | 北京嘀嘀无限科技发展有限公司 | User intention recognition method and device, electronic equipment and storage medium |
CN111292752A (en) * | 2018-12-06 | 2020-06-16 | 北京嘀嘀无限科技发展有限公司 | User intention identification method and device, electronic equipment and storage medium |
CN109741751A (en) * | 2018-12-11 | 2019-05-10 | 上海交通大学 | Intension recognizing method and device towards intelligent sound control |
CN109545186A (en) * | 2018-12-16 | 2019-03-29 | 初速度(苏州)科技有限公司 | A kind of speech recognition training system and method |
CN109545186B (en) * | 2018-12-16 | 2022-05-27 | 魔门塔(苏州)科技有限公司 | Speech recognition training system and method |
CN111354354A (en) * | 2018-12-20 | 2020-06-30 | 深圳市优必选科技有限公司 | Training method and device based on semantic recognition and terminal equipment |
CN111354354B (en) * | 2018-12-20 | 2024-02-09 | 深圳市优必选科技有限公司 | Training method, training device and terminal equipment based on semantic recognition |
CN109857844B (en) * | 2018-12-29 | 2022-01-14 | 北京三快在线科技有限公司 | Intent recognition method and device based on ordering dialogue text and electronic equipment |
CN109857844A (en) * | 2018-12-29 | 2019-06-07 | 北京三快在线科技有限公司 | Intension recognizing method, device, electronic equipment based on dialog text of ordering |
CN111428872A (en) * | 2019-01-10 | 2020-07-17 | 维萨国际服务协会 | Systems, methods, and computer program products for incorporating knowledge from more complex models into simpler models |
US12039458B2 (en) | 2019-01-10 | 2024-07-16 | Visa International Service Association | System, method, and computer program product for incorporating knowledge from more complex models in simpler models |
CN109857868A (en) * | 2019-01-25 | 2019-06-07 | 北京奇艺世纪科技有限公司 | Model generating method, file classification method, device and computer readable storage medium |
US11507854B2 (en) * | 2019-01-29 | 2022-11-22 | Ricoh Company, Ltd. | Method and apparatus for recognizing intention, and non-transitory computer-readable recording medium |
CN109816111A (en) * | 2019-01-29 | 2019-05-28 | 北京金山数字娱乐科技有限公司 | Reading understands model training method and device |
CN111563208A (en) * | 2019-01-29 | 2020-08-21 | 株式会社理光 | Intention identification method and device and computer readable storage medium |
CN109871485A (en) * | 2019-02-13 | 2019-06-11 | 北京航空航天大学 | A kind of personalized recommendation method and device |
CN109871485B (en) * | 2019-02-13 | 2022-04-05 | 北京航空航天大学 | Personalized recommendation method and device |
CN111680514A (en) * | 2019-02-25 | 2020-09-18 | 北京猎户星空科技有限公司 | Information processing and model training method, device, equipment and storage medium |
CN111612025B (en) * | 2019-02-25 | 2023-12-12 | 北京嘀嘀无限科技发展有限公司 | Description model training method, text description device and electronic equipment |
CN111680514B (en) * | 2019-02-25 | 2024-03-01 | 北京猎户星空科技有限公司 | Information processing and model training method, device, equipment and storage medium |
CN111612025A (en) * | 2019-02-25 | 2020-09-01 | 北京嘀嘀无限科技发展有限公司 | Description model training method, text description device and electronic equipment |
CN109933663A (en) * | 2019-02-26 | 2019-06-25 | 上海凯岸信息科技有限公司 | Intention assessment algorithm based on embedding method |
WO2020211008A1 (en) * | 2019-04-17 | 2020-10-22 | 深圳市欢太科技有限公司 | Speech recognition method and apparatus, storage medium and electronic device |
CN110147535A (en) * | 2019-04-18 | 2019-08-20 | 平安科技(深圳)有限公司 | Similar Text generation method, device, equipment and storage medium |
CN110085308B (en) * | 2019-04-23 | 2022-02-25 | 挂号网(杭州)科技有限公司 | Diagnosis and treatment department classification method based on fusion deep learning |
CN110085308A (en) * | 2019-04-23 | 2019-08-02 | 挂号网(杭州)科技有限公司 | A kind of diagnosis and treatment department classification method based on fusion deep learning |
CN110287283B (en) * | 2019-05-22 | 2023-08-01 | 中国平安财产保险股份有限公司 | Intention model training method, intention recognition method, device, equipment and medium |
CN110287283A (en) * | 2019-05-22 | 2019-09-27 | 中国平安财产保险股份有限公司 | Intent model training method, intension recognizing method, device, equipment and medium |
CN110134969B (en) * | 2019-05-27 | 2023-07-14 | 北京奇艺世纪科技有限公司 | Entity identification method and device |
CN110134969A (en) * | 2019-05-27 | 2019-08-16 | 北京奇艺世纪科技有限公司 | A kind of entity recognition method and device |
WO2020237869A1 (en) * | 2019-05-31 | 2020-12-03 | 平安科技(深圳)有限公司 | Question intention recognition method and apparatus, computer device, and storage medium |
CN110222339A (en) * | 2019-06-05 | 2019-09-10 | 深圳市思迪信息技术股份有限公司 | Based on the intension recognizing method and device for improving XGBoost algorithm |
CN110222339B (en) * | 2019-06-05 | 2023-04-28 | 深圳市思迪信息技术股份有限公司 | Intention recognition method and device based on improved XGBoost algorithm |
CN110334344A (en) * | 2019-06-13 | 2019-10-15 | 腾讯科技(深圳)有限公司 | A kind of semanteme intension recognizing method, device, equipment and storage medium |
CN110334344B (en) * | 2019-06-13 | 2024-05-14 | 腾讯科技(深圳)有限公司 | Semantic intention recognition method, device, equipment and storage medium |
CN112185351A (en) * | 2019-07-05 | 2021-01-05 | 北京猎户星空科技有限公司 | Voice signal processing method and device, electronic equipment and storage medium |
CN112185351B (en) * | 2019-07-05 | 2024-05-24 | 北京猎户星空科技有限公司 | Voice signal processing method and device, electronic equipment and storage medium |
CN110516035A (en) * | 2019-07-05 | 2019-11-29 | 同济大学 | A kind of man-machine interaction method and system of mixing module |
CN110610001B (en) * | 2019-08-12 | 2024-01-23 | 大箴(杭州)科技有限公司 | Short text integrity recognition method, device, storage medium and computer equipment |
CN110610001A (en) * | 2019-08-12 | 2019-12-24 | 大箴(杭州)科技有限公司 | Short text integrity identification method and device, storage medium and computer equipment |
CN112417145A (en) * | 2019-08-23 | 2021-02-26 | 武汉Tcl集团工业研究院有限公司 | Text multi-classification model generation method, text processing device and medium |
CN110532368A (en) * | 2019-09-04 | 2019-12-03 | 深圳前海达闼云端智能科技有限公司 | Question answering method, electronic equipment and computer readable storage medium |
CN110619047A (en) * | 2019-09-16 | 2019-12-27 | 出门问问信息科技有限公司 | Method and device for constructing natural language model and readable storage medium |
CN110619047B (en) * | 2019-09-16 | 2022-09-02 | 出门问问信息科技有限公司 | Method and device for constructing natural language model and readable storage medium |
CN110674253A (en) * | 2019-09-23 | 2020-01-10 | 出门问问信息科技有限公司 | Semantic processing method and equipment |
CN110807332B (en) * | 2019-10-30 | 2024-02-27 | 腾讯科技(深圳)有限公司 | Training method, semantic processing method, device and storage medium for semantic understanding model |
US11967312B2 (en) | 2019-10-30 | 2024-04-23 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for training semantic understanding model, electronic device, and storage medium |
CN110807332A (en) * | 2019-10-30 | 2020-02-18 | 腾讯科技(深圳)有限公司 | Training method of semantic understanding model, semantic processing method, semantic processing device and storage medium |
CN112749565A (en) * | 2019-10-31 | 2021-05-04 | 华为终端有限公司 | Semantic recognition method and device based on artificial intelligence and semantic recognition equipment |
WO2021082570A1 (en) * | 2019-10-31 | 2021-05-06 | 华为技术有限公司 | Artificial intelligence-based semantic identification method, device, and semantic identification apparatus |
CN110852108A (en) * | 2019-11-11 | 2020-02-28 | 中山大学 | Joint training method, apparatus and medium for entity recognition and entity disambiguation |
CN110956018A (en) * | 2019-11-22 | 2020-04-03 | 腾讯科技(深圳)有限公司 | Training method of text processing model, text processing method, text processing device and storage medium |
CN110956018B (en) * | 2019-11-22 | 2023-04-18 | 腾讯科技(深圳)有限公司 | Training method of text processing model, text processing method, text processing device and storage medium |
CN112906370A (en) * | 2019-12-04 | 2021-06-04 | 马上消费金融股份有限公司 | Intention recognition model training method, intention recognition method and related device |
CN112906370B (en) * | 2019-12-04 | 2022-12-20 | 马上消费金融股份有限公司 | Intention recognition model training method, intention recognition method and related device |
CN113010667A (en) * | 2019-12-20 | 2021-06-22 | 王道维 | Training method for machine learning decision model by using natural language corpus |
CN111198938A (en) * | 2019-12-26 | 2020-05-26 | 深圳市优必选科技股份有限公司 | Sample data processing method, sample data processing device and electronic equipment |
CN111198938B (en) * | 2019-12-26 | 2023-12-01 | 深圳市优必选科技股份有限公司 | Sample data processing method, sample data processing device and electronic equipment |
CN111161740A (en) * | 2019-12-31 | 2020-05-15 | 中国建设银行股份有限公司 | Intention recognition model training method, intention recognition method and related device |
CN111209383B (en) * | 2020-01-06 | 2023-04-07 | 广州小鹏汽车科技有限公司 | Method and device for processing multi-turn dialogue, vehicle, and storage medium |
CN111209383A (en) * | 2020-01-06 | 2020-05-29 | 广州小鹏汽车科技有限公司 | Method and device for processing multi-turn dialogue, vehicle, and storage medium |
WO2021164244A1 (en) * | 2020-02-18 | 2021-08-26 | 百度在线网络技术(北京)有限公司 | Voice interaction method and apparatus, device and computer storage medium |
US11978447B2 (en) | 2020-02-18 | 2024-05-07 | Baidu Online Network Technology (Beijing) Co., Ltd. | Speech interaction method, apparatus, device and computer storage medium |
CN111309915A (en) * | 2020-03-03 | 2020-06-19 | 爱驰汽车有限公司 | Method, system, device and storage medium for training natural language of joint learning |
CN111667066B (en) * | 2020-04-23 | 2024-06-11 | 北京旷视科技有限公司 | Training method and device of network model, character recognition method and device and electronic equipment |
CN111667066A (en) * | 2020-04-23 | 2020-09-15 | 北京旷视科技有限公司 | Network model training and character recognition method and device and electronic equipment |
CN111597804B (en) * | 2020-05-15 | 2023-03-10 | 腾讯科技(深圳)有限公司 | Method and related device for training entity recognition model |
CN111597804A (en) * | 2020-05-15 | 2020-08-28 | 腾讯科技(深圳)有限公司 | Entity recognition model training method and related device |
CN111680517A (en) * | 2020-06-10 | 2020-09-18 | 北京百度网讯科技有限公司 | Method, apparatus, device and storage medium for training a model |
CN111680517B (en) * | 2020-06-10 | 2023-05-16 | 北京百度网讯科技有限公司 | Method, apparatus, device and storage medium for training model |
CN111538894A (en) * | 2020-06-19 | 2020-08-14 | 腾讯科技(深圳)有限公司 | Query feedback method and device, computer equipment and storage medium |
CN111737416B (en) * | 2020-06-29 | 2022-08-19 | 重庆紫光华山智安科技有限公司 | Case processing model training method, case text processing method and related device |
CN111737416A (en) * | 2020-06-29 | 2020-10-02 | 重庆紫光华山智安科技有限公司 | Case processing model training method, case text processing method and related device |
CN111785350A (en) * | 2020-06-30 | 2020-10-16 | 易联众信息技术股份有限公司 | Information extraction method, application, device and medium |
CN111949784A (en) * | 2020-08-14 | 2020-11-17 | 中国工商银行股份有限公司 | Outbound method and device based on intention recognition |
CN112102832A (en) * | 2020-09-18 | 2020-12-18 | 广州小鹏汽车科技有限公司 | Speech recognition method, speech recognition device, server and computer-readable storage medium |
CN112183631A (en) * | 2020-09-28 | 2021-01-05 | 云知声智能科技股份有限公司 | Method and terminal for establishing intention classification model |
CN112183631B (en) * | 2020-09-28 | 2024-01-12 | 云知声智能科技股份有限公司 | Method and terminal for establishing intention classification model |
CN112256826A (en) * | 2020-10-19 | 2021-01-22 | 网易(杭州)网络有限公司 | Emotion analysis method, evaluation method and emotion analysis model training method and device |
CN112417116A (en) * | 2020-11-18 | 2021-02-26 | 四川长虹电器股份有限公司 | Question understanding model training method and system based on few-sample corpus |
CN112417116B (en) * | 2020-11-18 | 2022-03-15 | 四川长虹电器股份有限公司 | Question understanding model training method and system based on few-sample corpus |
CN112417886A (en) * | 2020-11-20 | 2021-02-26 | 平安普惠企业管理有限公司 | Intention entity information extraction method and device, computer equipment and storage medium |
CN112417886B (en) * | 2020-11-20 | 2024-08-27 | 西藏纳柯电子科技有限公司 | Method, device, computer equipment and storage medium for extracting intention entity information |
CN112528679B (en) * | 2020-12-17 | 2024-02-13 | 科大讯飞股份有限公司 | Method and device for training intention understanding model, and method and device for intention understanding |
CN112528679A (en) * | 2020-12-17 | 2021-03-19 | 科大讯飞股份有限公司 | Intention understanding model training method and device and intention understanding method and device |
CN112765356A (en) * | 2021-01-29 | 2021-05-07 | 苏州思必驰信息科技有限公司 | Training method and system of multi-intention recognition model |
CN112905893A (en) * | 2021-03-22 | 2021-06-04 | 北京百度网讯科技有限公司 | Training method of search intention recognition model, search intention recognition method and device |
CN112905893B (en) * | 2021-03-22 | 2024-01-12 | 北京百度网讯科技有限公司 | Training method of search intention recognition model, search intention recognition method and device |
CN113094475A (en) * | 2021-06-08 | 2021-07-09 | 成都晓多科技有限公司 | Dialog intention recognition system and method based on context attention flow |
CN113254617A (en) * | 2021-06-11 | 2021-08-13 | 成都晓多科技有限公司 | Message intention identification method and system based on pre-training language model and encoder |
CN113486178B (en) * | 2021-07-12 | 2023-12-01 | 恒安嘉新(北京)科技股份公司 | Text recognition model training method, text recognition method, device and medium |
CN113486178A (en) * | 2021-07-12 | 2021-10-08 | 恒安嘉新(北京)科技股份公司 | Text recognition model training method, text recognition device and medium |
CN114997154A (en) * | 2022-05-11 | 2022-09-02 | 北京科东电力控制系统有限责任公司 | Automatic construction method and system for speaker-to-speaker robot corpus |
Also Published As
Publication number | Publication date |
---|---|
CN108920622B (en) | 2021-07-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108920622A (en) | A kind of training method of intention assessment, training device and identification device | |
JP6955580B2 (en) | Document summary automatic extraction method, equipment, computer equipment and storage media | |
CN108711420A (en) | Multilingual hybrid model foundation, data capture method and device, electronic equipment | |
CN109840287A (en) | A kind of cross-module state information retrieval method neural network based and device | |
CN104115221B (en) | Changed based on Text To Speech and semantic audio human interaction proof | |
CN109523994A (en) | A kind of multitask method of speech classification based on capsule neural network | |
CN108984530A (en) | A kind of detection method and detection system of network sensitive content | |
CN108711421A (en) | A kind of voice recognition acoustic model method for building up and device and electronic equipment | |
CN111159395A (en) | Chart neural network-based rumor standpoint detection method and device and electronic equipment | |
CN107423363A (en) | Art generation method, device, equipment and storage medium based on artificial intelligence | |
CN111625658A (en) | Voice interaction method, device and equipment based on knowledge graph and storage medium | |
CN109213856A (en) | Semantic recognition method and system | |
CN108073565A (en) | The method and apparatus and machine translation method and equipment of words criterion | |
CN108491515B (en) | Sentence pair matching degree prediction method for campus psychological consultation | |
CN110223675A (en) | The screening technique and system of training text data for speech recognition | |
CN106682387A (en) | Method and device used for outputting information | |
CN107437417A (en) | Based on speech data Enhancement Method and device in Recognition with Recurrent Neural Network speech recognition | |
CN107679225A (en) | A kind of reply generation method based on keyword | |
CN112559749A (en) | Intelligent matching method and device for teachers and students in online education and storage medium | |
Zhou et al. | ICRC-HIT: A deep learning based comment sequence labeling system for answer selection challenge | |
CN109543176A (en) | A kind of abundant short text semantic method and device based on figure vector characterization | |
CN111597341A (en) | Document level relation extraction method, device, equipment and storage medium | |
CN110532393A (en) | Text handling method, device and its intelligent electronic device | |
CN109859747A (en) | Voice interactive method, equipment and storage medium | |
CN113392197A (en) | Question-answer reasoning method and device, storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||