CN110083834A - Semantic matching model training method and apparatus, electronic device, and storage medium - Google Patents

Semantic matching model training method and apparatus, electronic device, and storage medium Download PDF

Info

Publication number
CN110083834A
CN110083834A
Authority
CN
China
Prior art keywords
sequence
transfer function
semantic matching
model
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910332189.7A
Other languages
Chinese (zh)
Other versions
CN110083834B (en)
Inventor
龚建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910332189.7A priority Critical patent/CN110083834B/en
Publication of CN110083834A publication Critical patent/CN110083834A/en
Application granted granted Critical
Publication of CN110083834B publication Critical patent/CN110083834B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Machine Translation (AREA)

Abstract

The present application proposes a semantic matching model training method and apparatus, an electronic device, and a storage medium. The method comprises: obtaining training data that includes multiple training samples, each training sample comprising a sequence pair whose type is either a word sequence pair or a phrase sequence pair; for each training sample, mapping the sequence pair in the training sample with a preset mapping layer to obtain a feature vector pair; according to the type of the sequence pair, selecting a corresponding transfer function from multiple preset transfer functions to process the feature vector pair and obtain a predicted matching degree; and revising the coefficients of the mapping layer and the preset transfer functions according to the predicted matching degrees of the sequence pairs in the multiple training samples, to obtain a trained semantic matching model. This semantic matching model training method thereby shortens the training cycle of the semantic matching model and reduces the training cost.

Description

Semantic matching model training method and apparatus, electronic device, and storage medium
Technical field
This application relates to the field of computer application technology, and in particular to a semantic matching model training method and apparatus, an electronic device, and a storage medium.
Background technique
The rapid development of the Internet has provided people with an entirely new carrier for storing, processing, transmitting, and using information, and the network has quickly become one of the main channels through which people acquire knowledge and information. While information resources of this scale cover nearly all of the knowledge humanity possesses, they also confront users with the problem of how to fully exploit and utilize them. Search engines arose to meet exactly this need: they help network users find information on the Internet. Specifically, a search engine collects information from the Internet with specific computer programs according to certain strategies, organizes and processes that information, and then provides a search service that retrieves relevant information for the user and presents it to the user.
In web search, how well a web page title matches the user's search request is of vital importance to the ranking quality of the recalled search results. In the related art, to improve the accuracy of the results recalled by a search engine, a deep learning framework is typically used to separately train a bag-of-words model, a convolutional neural network model, a general regression neural network model, and so on, which then compute the relevance match between the web page title and the search request. However, with this way of training semantic matching models, model training takes a long time and is costly.
Summary of the invention
The semantic matching model training method and apparatus, electronic device, and storage medium proposed by this application are intended to solve the problem in the related art that, because each semantic matching model must be trained individually, model training takes a long time and is costly.
A semantic matching model training method proposed by an embodiment of one aspect of this application comprises: obtaining training data, the training data including multiple training samples, each training sample including a sequence pair whose type is either a word sequence pair or a phrase sequence pair; for each training sample, mapping the sequence pair in the training sample with a preset mapping layer to obtain a feature vector pair; according to the type of the sequence pair, selecting a corresponding transfer function from multiple preset transfer functions to process the feature vector pair and obtain a predicted matching degree; and revising the coefficients of the mapping layer and the preset transfer functions according to the predicted matching degrees of the sequence pairs in the multiple training samples, to obtain a trained semantic matching model.
A semantic matching model training apparatus proposed by an embodiment of another aspect of this application comprises: an obtaining module, configured to obtain training data, the training data including multiple training samples, each training sample including a sequence pair whose type is either a word sequence pair or a phrase sequence pair; a mapping module, configured to map, for each training sample, the sequence pair in the training sample with a preset mapping layer to obtain a feature vector pair; a processing module, configured to select, according to the type of the sequence pair, a corresponding transfer function from multiple preset transfer functions to process the feature vector pair and obtain a predicted matching degree; and a revision module, configured to revise the coefficients of the mapping layer and the preset transfer functions according to the predicted matching degrees of the sequence pairs in the multiple training samples, to obtain a trained semantic matching model.
An electronic device proposed by an embodiment of yet another aspect of this application comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the semantic matching model training method described above.
A computer-readable storage medium proposed by an embodiment of yet another aspect of this application has a computer program stored thereon, wherein the program, when executed by a processor, implements the semantic matching model training method described above.
A computer program product proposed by an embodiment of a further aspect of this application implements the semantic matching model training method described above when the instructions in the computer program product are executed by a processor.
With the semantic matching model training method, apparatus, electronic device, computer-readable storage medium, and computer program product provided by the embodiments of this application, training data including multiple training samples can be obtained, where each training sample includes a sequence pair whose type is either a word sequence pair or a phrase sequence pair. For each training sample, the sequence pair in the training sample is mapped with a preset mapping layer to obtain a feature vector pair; then, according to the type of the sequence pair, a corresponding transfer function is selected from multiple preset transfer functions to process the feature vector pair and obtain a predicted matching degree; the coefficients of the mapping layer and the preset transfer functions are then revised according to the predicted matching degrees of the sequence pairs in the multiple training samples, yielding a trained semantic matching model. Because the same mapping layer is used to obtain the predicted matching degrees of the multiple transfer functions, and the coefficients of the multiple transfer functions are revised according to their respective predicted matching degrees, the multiple transfer functions are trained simultaneously on a shared mapping layer, which shortens the training cycle of the semantic matching model and reduces the training cost.
Additional aspects and advantages of this application will be set forth in part in the following description, and in part will become apparent from the description or be learned by practice of this application.
Detailed description of the invention
The above and/or additional aspects and advantages of this application will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of a semantic matching model training method provided by an embodiment of this application;
Fig. 2 is a schematic diagram of a semantic matching model training framework provided by an embodiment of this application;
Fig. 3 is a schematic flowchart of another semantic matching model training method provided by an embodiment of this application;
Fig. 4-1 is a schematic diagram of a semantic matching model whose transfer function is a BoW model;
Fig. 4-2 is a schematic diagram of a semantic matching model whose transfer function is a CNN model;
Fig. 4-3 is a schematic diagram of a semantic matching model whose transfer function is a GRNN model;
Fig. 5 is a schematic structural diagram of a semantic matching model training apparatus provided by an embodiment of this application;
Fig. 6 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Specific embodiment
The embodiments of this application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements. The embodiments described below with reference to the drawings are exemplary; they are intended to explain this application and should not be construed as limiting it.
Directed at the problem in the related art that each semantic matching model must be trained individually, so that model training takes a long time and is costly, the embodiments of this application propose a semantic matching model training method.
With the semantic matching model training method provided by the embodiments of this application, training data including multiple training samples can be obtained, where each training sample includes a sequence pair whose type is either a word sequence pair or a phrase sequence pair. For each training sample, the sequence pair in the training sample is mapped with a preset mapping layer to obtain a feature vector pair; then, according to the type of the sequence pair, a corresponding transfer function is selected from multiple preset transfer functions to process the feature vector pair and obtain a predicted matching degree; the coefficients of the mapping layer and the preset transfer functions are then revised according to the predicted matching degrees of the sequence pairs in the multiple training samples, yielding a trained semantic matching model. Because the same mapping layer is used to obtain the predicted matching degrees of the multiple transfer functions, and the coefficients of the multiple transfer functions are revised according to their respective predicted matching degrees, the multiple transfer functions are trained simultaneously on a shared mapping layer, which shortens the training cycle of the semantic matching model and reduces the training cost.
The semantic matching model training method and apparatus, electronic device, storage medium, and computer program product provided by this application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of a semantic matching model training method provided by an embodiment of this application.
As shown in Fig. 1, the semantic matching model training method comprises the following steps:
Step 101: obtain training data, the training data including multiple training samples; each training sample includes a sequence pair, and the type of the sequence pair is either a word sequence pair or a phrase sequence pair.
In this embodiment, the training data contains a large number of training samples, and each training sample contains a sequence pair. The sequences in a sequence pair may be words, phrases, sentences, and so on; that is, the type of the sequence pair may be a word sequence pair or a phrase sequence pair. In other words, if the type of the sequence pair is a word sequence pair, the sequence pair includes two corresponding words; if the type of the sequence pair is a phrase sequence pair, the sequence pair includes two corresponding phrases.
Preferably, the training samples in the training data may further be divided into positive samples and negative samples. A positive sample is a training sample in which the two sequences of the sequence pair have a high correlation, while a negative sample is one in which the two sequences of the sequence pair have a low correlation. In one possible implementation of the embodiments of this application, the type of each training sample in the training data can be determined when the training data is obtained.
As one possible implementation, before the semantic matching model is trained, training data first needs to be obtained so that the semantic matching model can be trained on it. In a practical application scenario, the training data can be obtained from massive user search and click behavior data; that is, a training sample can consist of a search request entered by a user and the title of a search result recalled by the search engine for that request. In other words, the two sequences of the sequence pair contained in a training sample are, respectively, the search request entered by the user and the title of a search result corresponding to that search request. The type of each training sample can be determined from the user's click data on the titles of the search results.
Specifically, if the user clicks the title of search result A, the search request entered by the user and the title of search result A can be composed into a sequence pair, and the training sample including this sequence pair is determined to be a positive sample; if the user does not click the title of search result B, the search request entered by the user and the title of search result B can be composed into a sequence pair, and the training sample including this sequence pair is determined to be a negative sample.
Step 102: for each training sample, map the sequence pair in the training sample with a preset mapping layer to obtain a feature vector pair.
In this embodiment, after the training data is obtained, word segmentation can first be performed on each sequence in each training sample, and the preset mapping layer is then used to map the segments of each sequence to determine the feature vector corresponding to each segment. The feature vector corresponding to each sequence is then determined from the feature vectors of the segments the sequence contains: for example, the feature vectors of the multiple segments contained in a sequence can be concatenated to serve as the feature vector of the sequence, or alternatively they can be summed to serve as the feature vector of the sequence. Once the feature vector corresponding to each sequence is obtained, the feature vector pair corresponding to the sequence pair in each training sample is obtained.
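A minimal sketch of this mapping layer, using the summation option described above; the tiny embedding table, its dimensionality, and its values are invented for illustration:

```python
# Each token is looked up in a shared embedding table, and the token vectors
# of a sequence are summed to give one feature vector per sequence; applying
# this to both sequences of a pair yields the feature vector pair.

EMBEDDING = {
    "deep":     [0.1, 0.3, 0.0],
    "learning": [0.2, 0.1, 0.4],
    "tutorial": [0.0, 0.2, 0.1],
}

def map_sequence(tokens, dim=3):
    vec = [0.0] * dim
    for t in tokens:
        emb = EMBEDDING.get(t, [0.0] * dim)  # unknown tokens map to zeros
        vec = [v + e for v, e in zip(vec, emb)]
    return vec

def map_pair(seq_a, seq_b):
    return map_sequence(seq_a), map_sequence(seq_b)

fa, fb = map_pair(["deep", "learning"], ["deep", "learning", "tutorial"])
```

Because both sides of the pair go through the same table, every transfer function trained on these feature vectors shares one mapping layer, which is the point of the method.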
Step 103: according to the type of the sequence pair, select a corresponding transfer function from multiple preset transfer functions to process the feature vector pair and obtain a predicted matching degree.
In this embodiment, the preset mapping layer can be used to train multiple semantic recognition models separately: multiple transfer functions can be preset, and the feature vector pairs obtained through the preset mapping layer for all training samples in the training data are used to train the multiple transfer functions.
As one possible implementation, because the type of training data required may differ when different transfer functions are trained, a transfer function applicable to the type of the sequence pair in the training sample, i.e. the transfer function corresponding to the sequence pair's type, can be selected from the multiple preset transfer functions. The selected transfer function then processes all the feature vector pairs corresponding to the training data to determine the predicted matching degree corresponding to each feature vector pair, i.e. the predicted matching degree of each sequence pair.
Preferably, the predicted matching degree corresponding to a feature vector pair can be the cosine similarity between the two feature vectors of the pair. In actual use, the way the predicted matching degree is computed can also be preset according to actual needs and the specific application scenario; the embodiments of this application do not limit this.
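A minimal sketch of the cosine-similarity choice of predicted matching degree mentioned above (other similarity measures could be substituted, as the text notes):

```python
import math

# Predicted matching degree of a feature vector pair as the cosine
# similarity between its two feature vectors.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

score = cosine_similarity([1.0, 0.0], [1.0, 0.0])  # identical direction -> 1.0
```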
As one possible implementation, the types of the sequence pairs contained in the various training samples in the training data may differ: the sequence pairs of some training samples are word sequence pairs, while those of other training samples are phrase sequence pairs. Therefore, before the corresponding transfer function is selected from the multiple preset transfer functions, the training samples can first be grouped according to the type of the sequence pair they contain, and the feature vector pairs of the grouped training samples are then used to train the multiple transfer functions separately.
For example, if the sequence pairs of the training samples in the training data include both word sequence pairs and phrase sequence pairs, the training samples can be divided by sequence-pair type into a group corresponding to word sequence pairs and a group corresponding to phrase sequence pairs. The transfer function corresponding to word sequence pairs is then selected from the multiple preset transfer functions and used to process the feature vector pairs of the training samples in the word-sequence-pair group, yielding the predicted matching degree of the sequence pair in each of those training samples; likewise, the transfer function corresponding to phrase sequence pairs is selected from the multiple preset transfer functions and used to process the feature vector pairs of the training samples in the phrase-sequence-pair group, yielding the predicted matching degree of the sequence pair in each of those training samples.
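The grouping-and-routing logic above can be sketched as follows. The scoring functions are placeholders standing in for the trained transfer functions, and the type keys and registry layout are assumed names for the sketch:

```python
# Route each training sample to the transfer functions registered for its
# sequence-pair type and collect one predicted matching degree per function.

def score_bow(pair):  return 0.9   # stand-in for the bag-of-words scorer
def score_cnn(pair):  return 0.8   # stand-in for the CNN scorer
def score_grnn(pair): return 0.7   # stand-in for the GRNN scorer

TRANSFER_FUNCTIONS = {
    "word":   [score_bow],                         # word sequence pairs: BoW only
    "phrase": [score_bow, score_cnn, score_grnn],  # phrase pairs: all three
}

def predict_matching_degrees(sample):
    functions = TRANSFER_FUNCTIONS[sample["type"]]
    return [fn(sample["pair"]) for fn in functions]

degrees = predict_matching_degrees({"type": "phrase", "pair": ("q", "t")})
```

The per-type registry mirrors the assignment made explicit further below: word sequence pairs go through the bag-of-words model only, phrase sequence pairs through all three models.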
Further, the preset transfer functions may include a bag-of-words (BoW) model, a convolutional neural network (CNN) model, and a general regression neural network (GRNN) model. That is, in one possible implementation of the embodiments of this application, the transfer function corresponding to the above word sequence pairs includes the bag-of-words model, and the transfer functions corresponding to the above phrase sequence pairs include the bag-of-words model, the convolutional neural network model, and the general regression neural network model. Correspondingly, the above step 103 may include:
if the type of the sequence pair is a word sequence pair, processing the feature vector pair with the bag-of-words model to obtain a predicted matching degree;
if the type of the sequence pair is a phrase sequence pair, processing the feature vector pair with the bag-of-words model, the convolutional neural network model, and the general regression neural network model, respectively, to obtain predicted matching degrees.
As one possible implementation, if the type of the sequence pair in a training sample is a word sequence pair, the bag-of-words model can be used to process the feature vector pair of the training sample to determine the predicted matching degree of the sequence pair the training sample contains; if the type of the sequence pair is a phrase sequence pair, the bag-of-words model, the convolutional neural network model, and the general regression neural network model can each be used to process the feature vector pair of the training sample to determine the predicted matching degrees of the sequence pair the training sample contains.
Fig. 2 is a schematic diagram of a semantic matching model training framework provided by an embodiment of this application. In Fig. 2, query is the search request entered by the user, and title is the title of a search result recalled by the search engine; one query and one title constitute a sequence pair. cos_1 is the predicted matching degree obtained by processing the feature vector pair of query and title with the bag-of-words model (bow); cos_2 is the predicted matching degree obtained by processing the feature vector pair of query and title with the convolutional neural network model (cnn); cos_3 is the predicted matching degree obtained by processing the feature vector pair of query and title with the general regression neural network model (grnn).
Step 104: according to the predicted matching degrees of the sequence pairs in the multiple training samples, revise the coefficients of the mapping layer and the preset transfer functions to obtain a trained semantic matching model.
Here, the coefficients of the mapping layer and of the multiple transfer functions refer to the parameters of the mapping layer and of each transfer function. By revising the coefficients of the mapping layer and of the preset transfer functions, the performance of the mapping layer and of the preset transfer functions can be brought closer to expectations.
In this embodiment, according to the predicted matching degrees of the sequence pairs in the multiple training samples contained in the training data, the difference between the predicted matching degree of the sequence pair in a positive sample and the predicted matching degree of the sequence pair in a negative sample corresponding to the same user search request can be determined. The coefficients of the mapping layer and of the preset transfer functions are then revised so as to widen, as far as possible, the difference between the predicted matching degree of the sequence pair in the positive sample and that in the negative sample corresponding to the same user search request, thereby obtaining a trained semantic matching model. Specifically, the loss value corresponding to each preset transfer function can be determined by formula (1), and the coefficients of each preset transfer function and of the mapping layer are then revised according to the respective loss values, to obtain the trained semantic matching model:
Loss = Σ_{(Query, Title⁺, Title⁻) ∈ D} max(0, a − (Similarity(Query, Title⁺) − Similarity(Query, Title⁻)))   (1)
In formula (1), Loss is the loss value; Query is the user search request in a sequence pair of a training sample; Title⁻ is the negative-example title corresponding to Query in a negative sample; Title⁺ is the positive-example title corresponding to Query in a positive sample; Similarity(Query, Title⁻) is the predicted matching degree between Query and Title⁻, i.e. the predicted matching degree of the sequence pair in a training sample containing Query and Title⁻; Similarity(Query, Title⁺) is the predicted matching degree between Query and Title⁺, i.e. the predicted matching degree of the sequence pair in a training sample containing Query and Title⁺; D is the log that records Query and the user's click behavior on the titles of the search results corresponding to Query; and a is a constant.
It should be noted that, in actual use, the value of the constant a can be preset according to actual needs; the embodiments of this application do not limit this. For example, the constant a can be preset to 0.1.
As one possible implementation, a threshold for the Loss value can be preset, and the predicted matching degrees of the sequence pairs in the multiple training samples obtained through each preset transfer function are substituted into formula (1) to determine the Loss value corresponding to each preset transfer function; it is then judged, for each preset transfer function separately, whether its Loss value is below the preset threshold.
Specifically, if the Loss value corresponding to a transfer function is below the threshold, the difference determined by that transfer function between the predicted matching degree of the sequence pair in the positive sample and that in the negative sample is large, and the performance of the transfer function meets the requirements; the coefficients of the transfer function and the mapping layer therefore do not need to be revised, and training of that transfer function is complete. If the Loss value corresponding to a transfer function is not below the threshold, the difference determined by that transfer function between the positive-sample and negative-sample predicted matching degrees is small, and the performance of the transfer function does not meet the requirements; the coefficients of the transfer function and the mapping layer then need to be revised until the Loss value corresponding to the transfer function is below the threshold, that is, until the difference between the positive-sample and negative-sample predicted matching degrees determined by the transfer function meets the requirements, at which point training of that transfer function is complete.
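Under the assumption that formula (1) is the pairwise margin loss suggested by its variable definitions, the Loss computation and the threshold check described above can be sketched as follows. The margin value 0.1 follows the example given for the constant a; the threshold value and function names are illustrative:

```python
# Pairwise margin loss: each (positive, negative) similarity pair for a
# query contributes max(0, a - (sim_pos - sim_neg)); the loss is zero once
# every positive title outscores its negative counterpart by at least a.

def pairwise_loss(triples, a=0.1):
    """triples: list of (sim_positive, sim_negative) pairs from the log D."""
    return sum(max(0.0, a - (sim_pos - sim_neg)) for sim_pos, sim_neg in triples)

def needs_correction(triples, threshold=0.05, a=0.1):
    # Coefficients are revised only while the loss stays at or above threshold.
    return pairwise_loss(triples, a) >= threshold

well_separated = [(0.9, 0.2), (0.8, 0.1)]  # positives clearly outscore negatives
loss = pairwise_loss(well_separated)       # every hinge term is zero
```

Each transfer function gets its own loss value computed from its own predicted matching degrees, which is how the shared mapping layer can serve several transfer functions while each one converges independently.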
With the semantic matching model training method provided by the embodiments of this application, training data including multiple training samples can be obtained, where each training sample includes a sequence pair whose type is either a word sequence pair or a phrase sequence pair. For each training sample, the sequence pair in the training sample is mapped with a preset mapping layer to obtain a feature vector pair; then, according to the type of the sequence pair, a corresponding transfer function is selected from the multiple preset transfer functions to process the feature vector pair and obtain a predicted matching degree; the coefficients of the mapping layer and the preset transfer functions are then revised according to the predicted matching degrees of the sequence pairs in the multiple training samples, yielding a trained semantic matching model. Because the same mapping layer is used to obtain the predicted matching degrees of the multiple transfer functions, and the coefficients of the multiple transfer functions are revised according to their respective predicted matching degrees, the multiple transfer functions are trained simultaneously on a shared mapping layer, which shortens the training cycle of the semantic matching model and reduces the training cost.
In one possible implementation of the present application, after the trained semantic matching model is obtained, a transfer function that matches the requirements of the current application scenario can be selected according to that scenario, and semantic matching is then performed with a semantic matching model that includes the selected transfer function.
The semantic matching model training method provided by the embodiments of the present application is further described below with reference to Fig. 3.

Fig. 3 is a flow diagram of another semantic matching model training method provided by an embodiment of the present application.

As shown in Fig. 3, the semantic matching model training method includes the following steps:
Step 201: obtain training data, the training data including multiple training samples, each training sample including a sequence pair, the type of the sequence pair being a word sequence pair or a phrase sequence pair.

Step 202: for each training sample, map the sequence pair in the training sample with a preset mapping layer to obtain a feature vector pair.

Step 203: according to the type of the sequence pair, select a corresponding transfer function from multiple preset transfer functions to process the feature vector pair and obtain a predicted matching degree.

Step 204: correct the coefficients of the mapping layer and of the multiple preset transfer functions according to the predicted matching degrees of the sequence pairs in the multiple training samples, to obtain a trained semantic matching model.
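Steps 201-204 above can be sketched as one loop in which a single shared mapping layer feeds whichever transfer function the sample's sequence-pair type selects. Everything below is a schematic stand-in: a toy hash-based `map_layer` replaces the real Embedding layer, and a dot product replaces the BoW/CNN/GRNN transfer functions:

```python
def map_layer(seq, dim=8):
    # Toy stand-in for the shared mapping (Embedding) layer:
    # hash each token into a fixed-size bag-of-features vector.
    vec = [0.0] * dim
    for tok in seq.split():
        vec[hash(tok) % dim] += 1.0
    return vec

def dot_match(u, v):
    # Toy transfer function: unnormalised dot-product matching degree.
    return sum(a * b for a, b in zip(u, v))

# One transfer function per sequence-pair type, all sharing map_layer.
TRANSFER = {"word": dot_match, "phrase": dot_match}

def predict(sample):
    left, right = sample["pair"]
    u, v = map_layer(left), map_layer(right)  # step 202: feature vector pair
    fn = TRANSFER[sample["type"]]             # step 203: select by type
    return fn(u, v)                           # predicted matching degree

sample = {"type": "word", "pair": ("deep model", "deep model")}
print(predict(sample) > 0)  # identical sequences match themselves
```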
For the specific implementation and principles of steps 201-204, refer to the detailed description of the foregoing embodiment; they are not repeated here.
Step 205: obtain a sequence pair to be processed and an application scenario.

Here, the sequence pair to be processed refers to the two sequences that currently require semantic matching; each may be a word, a phrase, a sentence, and so on. For example, the sequence pair to be processed may be a search request input by a user in a search-engine application together with the title of a search result recalled by the search engine for that request.

In the embodiments of the present application, different application scenarios may impose different requirements on performance metrics of semantic matching such as accuracy and response speed. Therefore, before the sequence pair to be processed is handled, the current application scenario may first be obtained, to determine its requirements on the accuracy, response speed, and other performance metrics of semantic matching; a semantic matching model matched to the current application scenario is then selected.
Step 206: adjust the transfer functions in the semantic matching model according to the application scenario, to obtain a first semantic matching model corresponding to the application scenario.

In the embodiments of the present application, a transfer function consistent with the current application scenario may be determined according to the obtained scenario, and the transfer functions included in the semantic matching model are then adjusted accordingly, to determine the first semantic matching model corresponding to the current application scenario.

Specifically, after the transfer function consistent with the current application scenario is determined, it can be judged whether the semantic matching model includes that transfer function, and the transfer functions in the semantic matching model are adjusted accordingly. That is, in one possible implementation of the embodiments of the present application, step 206 may include:
determining a first transfer function corresponding to the application scenario;

judging whether the first transfer function is included in the multiple preset transfer functions;

if the first transfer function is included in the multiple preset transfer functions, deleting the other transfer functions among the multiple preset transfer functions except the first transfer function, to obtain the first semantic matching model corresponding to the application scenario;

if the first transfer function is not included in the multiple preset transfer functions, replacing the multiple preset transfer functions in the semantic matching model with the first transfer function, to obtain the first semantic matching model corresponding to the application scenario, and training the first semantic matching model with training data from the application scenario.
As one possible implementation, a mapping between application scenarios and transfer functions may be preset in advance, so that the first transfer function corresponding to the current application scenario is determined according to the obtained scenario and this mapping. Alternatively, a transfer function input by the user in real time may be obtained and determined to be the first transfer function corresponding to the current application scenario.
In the embodiments of the present application, the semantic matching model may include multiple transfer functions to suit the requirements of different application scenarios. After the first transfer function corresponding to the current application scenario is determined, it can be judged whether the first transfer function is included in the multiple preset transfer functions. If it is, the other preset transfer functions except the first transfer function can be deleted, to directly obtain the first semantic matching model corresponding to the current application scenario. If it is not, the multiple preset transfer functions in the semantic matching model can be replaced with the first transfer function to obtain the first semantic matching model corresponding to the current application scenario, and the first semantic matching model is then trained with the training steps of steps 201-204 to obtain a trained first semantic matching model. That is, according to the actual requirements of the application scenario, the preset mapping-layer data of the embodiments of the present application can be used as pre-trained parameters to initialize the training of the first semantic matching model, realizing transfer learning of the semantic matching model and improving its rapid deployment capability.
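The transfer-learning step described above — initializing a new first semantic matching model from the already-trained mapping-layer parameters — can be sketched as a parameter hand-off. The dictionary structure and field names here are illustrative assumptions, not structures from the embodiment:

```python
def init_first_model(pretrained_mapping, new_transfer_name):
    """Build a first semantic matching model whose mapping layer is
    initialised from pre-trained parameters (transfer learning),
    while the newly substituted transfer function starts untrained."""
    return {
        "mapping_layer": dict(pretrained_mapping),  # copied pre-trained weights
        "transfer_fn": new_transfer_name,
        "transfer_trained": False,  # must be trained on scenario data
    }

pretrained = {"w_embed": [0.1, -0.2, 0.3]}
model = init_first_model(pretrained, "GRNN")
print(model["mapping_layer"] == pretrained)  # True — weights reused
print(model["transfer_trained"])             # False
```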
For example, suppose the multiple preset transfer functions are a BoW model, a CNN model, and a GRNN model, whose prediction accuracy increases and whose response speed decreases in that order. If the current application scenario demands high prediction accuracy but has a low requirement on response speed, the first preset function corresponding to the current scenario can be determined to be the GRNN model; the BoW model and the CNN model can then be deleted, to obtain a first semantic matching model that includes only the GRNN model.
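Applied to this BoW/CNN/GRNN example, the delete-or-replace adjustment of step 206 can be sketched as follows; the scenario names and the scenario-to-function mapping are illustrative assumptions:

```python
# Preset transfer functions from the example: accuracy increases and
# response speed decreases from BoW to CNN to GRNN.
PRESET = ["BoW", "CNN", "GRNN"]

# Assumed scenario -> first-transfer-function mapping.
SCENARIO_FN = {"high_accuracy": "GRNN", "low_latency": "BoW"}

def adjust(preset, scenario):
    """Return the transfer functions kept in the first semantic
    matching model, and whether scenario-specific retraining is needed."""
    first = SCENARIO_FN.get(scenario, scenario)
    if first in preset:
        # Delete the other presets; the kept function is already trained.
        return [first], False
    # Replace all presets; the new function must then be trained on
    # data from the application scenario.
    return [first], True

print(adjust(PRESET, "high_accuracy"))  # (['GRNN'], False)
print(adjust(PRESET, "Transformer"))    # (['Transformer'], True)
```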
Further, in some application scenarios with very high demands on prediction accuracy, the prediction results of multiple transfer functions may also be fused, to further improve the prediction accuracy of the semantic matching model. That is, in one possible implementation of the embodiments of the present application, there are multiple first transfer functions; correspondingly, after the first transfer function corresponding to the application scenario is determined, the method may further include:

determining the weight values of the first transfer functions.

As one possible implementation, for application scenarios with higher demands on prediction accuracy, the prediction results of multiple transfer functions may be fused; that is, multiple first transfer functions corresponding to the current application scenario are determined.

Preferably, when multiple first transfer functions corresponding to the current application scenario are determined, a different weight may be set for the predicted matching degree corresponding to each first transfer function, to further improve the prediction accuracy and flexibility of the first semantic matching model; that is, by adjusting the weight of each first transfer function, performance metrics of the first semantic matching model such as prediction accuracy and response speed can be tuned.

Optionally, the weight value of each first transfer function corresponding to an application scenario may be preset in advance, so that once the current application scenario is obtained, the weight value of each corresponding first transfer function can be determined. Alternatively, a setting instruction for the weight values of the first transfer functions may be obtained from the user in real time, and the weight value of each first transfer function determined according to that instruction.

It should be noted that the manners of determining the weight value of each first transfer function include, but are not limited to, those listed above. In actual use, the manner of determining the weight values of the first transfer functions can be preset according to actual needs; the embodiments of the present application impose no limitation on this.
Step 207: input the sequence pair to be processed into the first semantic matching model, to obtain the predicted matching degree between the two sequences of the sequence pair to be processed.

In the embodiments of the present application, after the first semantic matching model corresponding to the current application scenario is determined, the sequence pair to be processed can be input into the first semantic matching model to obtain the predicted matching degree between its two sequences.
Fig. 4-1, Fig. 4-2, and Fig. 4-3 are, respectively, schematic diagrams of a semantic matching model whose transfer function is a BoW model, a CNN model, and a GRNN model. Here, query and title are the two sequences of the sequence pair to be processed, Embedding is the preset mapping layer, and cos_1, cos_2, and cos_3 are the predicted matching degrees between query and title (the two sequences of the sequence pair to be processed) output by the corresponding models.
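As the figure description indicates, each model scores the (query, title) pair with a cosine between the two mapped vectors (cos_1, cos_2, cos_3). A minimal cosine matching-degree computation over two assumed 4-dimensional encodings (the vector values are illustrative):

```python
import math

def cos_match(u, v):
    """Cosine similarity between two feature vectors, used as the
    predicted matching degree (cos_1/cos_2/cos_3 in Figs. 4-1 to 4-3)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Assumed 4-dim encodings of a query and a recalled title.
query_vec = [0.2, 0.7, 0.1, 0.4]
title_vec = [0.25, 0.65, 0.05, 0.5]
print(round(cos_match(query_vec, query_vec), 6))  # 1.0 — identical vectors
print(cos_match(query_vec, title_vec) > 0.9)      # True — closely matched
```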
Further, when there are multiple first transfer functions corresponding to the application scenario, the predicted matching degrees corresponding to the first transfer functions may also be fused according to their weight values. That is, in one possible implementation of the embodiments of the present application, step 207 may include:

inputting the sequence pair to be processed into the first semantic matching model, to obtain the predicted matching degree output by each first transfer function;

weighting and summing the output predicted matching degrees according to the weight value of each first transfer function, to obtain the predicted matching degree of the sequence pair to be processed.

As one possible implementation, if there are multiple first preset functions corresponding to the application scenario, the sequence pair to be processed can be input into the first semantic matching model to obtain the predicted matching degree output by each first transfer function; the output predicted matching degrees are then weighted and summed according to the weight value of each first transfer function, to obtain the predicted matching degree of the sequence pair to be processed.
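The weighted summation of step 207 reduces to a dot product between the per-function predicted matching degrees and their weights. A sketch with assumed values (the weights 0.3/0.7 and the predictions are illustrative, not from the embodiment):

```python
def fuse(pred_degrees, weights):
    """Weighted sum of the matching degrees output by each first
    transfer function, yielding the fused predicted matching degree."""
    if len(pred_degrees) != len(weights):
        raise ValueError("one weight per transfer function required")
    return sum(p * w for p, w in zip(pred_degrees, weights))

# Assumed outputs of two first transfer functions (e.g. CNN and GRNN)
# and scenario-specific weights favouring the more accurate model.
preds = [0.8, 0.9]
weights = [0.3, 0.7]
print(round(fuse(preds, weights), 2))  # 0.87
```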
With the semantic matching model training method provided by the embodiments of the present application, the obtained training data including multiple training samples can be input into a preset mapping layer, and the coefficients of the mapping layer and of the multiple preset transfer functions are corrected according to the predicted matching degrees output by the multiple preset transfer functions, yielding a trained semantic matching model. A sequence pair to be processed and an application scenario are then obtained; the transfer functions in the semantic matching model are adjusted according to the application scenario to obtain a first semantic matching model corresponding to the scenario; and the sequence pair to be processed is input into the first semantic matching model to obtain the predicted matching degree between its two sequences. By selecting a semantic matching model consistent with the requirements of the current application scenario, this not only shortens the training cycle of the semantic matching model and reduces the training cost, but also improves the prediction accuracy and flexibility of the semantic matching model.
To realize the above embodiments, the present application further proposes a semantic matching model training apparatus.

Fig. 5 is a structural schematic diagram of a semantic matching model training apparatus provided by an embodiment of the present application.

As shown in Fig. 5, the semantic matching model training apparatus 30 includes:
an obtaining module 31, configured to obtain training data, the training data including multiple training samples, each training sample including a sequence pair, the type of the sequence pair being a word sequence pair or a phrase sequence pair;

a mapping module 32, configured to, for each training sample, map the sequence pair in the training sample with a preset mapping layer to obtain a feature vector pair;

a processing module 33, configured to, according to the type of the sequence pair, select a corresponding transfer function from multiple preset transfer functions to process the feature vector pair and obtain a predicted matching degree;

a correction module 34, configured to correct the coefficients of the mapping layer and of the multiple preset transfer functions according to the predicted matching degrees of the sequence pairs in the multiple training samples, to obtain a trained semantic matching model.
In actual use, the semantic matching model training apparatus provided by the embodiments of the present application can be configured in any electronic device to execute the foregoing semantic matching model training method.

With the semantic matching model training apparatus provided by the embodiments of the present application, training data including multiple training samples can be obtained, where each training sample includes a sequence pair whose type is a word sequence pair or a phrase sequence pair. For each training sample, the sequence pair in the training sample is mapped by a preset mapping layer to obtain a feature vector pair; then, according to the type of the sequence pair, a corresponding transfer function is selected from multiple preset transfer functions to process the feature vector pair and obtain a predicted matching degree; finally, the coefficients of the mapping layer and of the multiple preset transfer functions are corrected according to the predicted matching degrees of the sequence pairs in the multiple training samples, yielding a trained semantic matching model. Because the same mapping layer is used to obtain the predicted matching degrees of the multiple transfer functions, and the coefficients of the multiple transfer functions are then corrected according to their respective predicted matching degrees, the multiple transfer functions are trained simultaneously with a shared mapping layer, which shortens the training cycle of the semantic matching model and reduces the training cost.
In one possible implementation of the present application, the transfer function corresponding to the word sequence pair includes a bag-of-words model, and the transfer functions corresponding to the phrase sequence pair include a bag-of-words model, a convolutional neural network model, and a general regression neural network model;

Correspondingly, the processing module 33 is specifically configured to:

if the type of the sequence pair is a word sequence pair, process the feature vector pair with the bag-of-words model to obtain a predicted matching degree;

if the type of the sequence pair is a phrase sequence pair, process the feature vector pair with the bag-of-words model, the convolutional neural network model, and the general regression neural network model, respectively, to obtain predicted matching degrees.
Further, in another possible implementation of the present application, the semantic matching model training apparatus 30 further includes:

a second obtaining module, configured to obtain a sequence pair to be processed and an application scenario;

an adjusting module, configured to adjust the transfer functions in the semantic matching model according to the application scenario, to obtain a first semantic matching model corresponding to the application scenario;

an input module, configured to input the sequence pair to be processed into the first semantic matching model, to obtain the predicted matching degree between the two sequences of the sequence pair to be processed.
Further, in yet another possible implementation of the present application, the adjusting module is further configured to:

determine a first transfer function corresponding to the application scenario;

judge whether the first transfer function is included in the multiple preset transfer functions;

if the first transfer function is included in the multiple preset transfer functions, delete the other transfer functions among the multiple preset transfer functions except the first transfer function, to obtain the first semantic matching model corresponding to the application scenario;

if the first transfer function is not included in the multiple preset transfer functions, replace the multiple preset transfer functions in the semantic matching model with the first transfer function, to obtain the first semantic matching model corresponding to the application scenario, and train the first semantic matching model with training data from the application scenario.
Further, in yet another possible implementation of the present application, there are multiple first transfer functions;

Correspondingly, the semantic matching model training apparatus 30 further includes:

a determining module, configured to determine the weight values of the first transfer functions;

The input module is further configured to:

input the sequence pair to be processed into the first semantic matching model, to obtain the predicted matching degree output by each first transfer function;

weight and sum the output predicted matching degrees according to the weight value of each first transfer function, to obtain the predicted matching degree of the sequence pair to be processed.
It should be noted that the foregoing explanations of the semantic matching model training method embodiments shown in Fig. 1 and Fig. 3 also apply to the semantic matching model training apparatus 30 of this embodiment and are not repeated here.

With the semantic matching model training apparatus provided by the embodiments of the present application, the obtained training data including multiple training samples can be input into a preset mapping layer, and the coefficients of the mapping layer and of the multiple preset transfer functions are corrected according to the predicted matching degrees output by the multiple preset transfer functions, yielding a trained semantic matching model. A sequence pair to be processed and an application scenario are then obtained; the transfer functions in the semantic matching model are adjusted according to the application scenario to obtain a first semantic matching model corresponding to the scenario; and the sequence pair to be processed is input into the first semantic matching model to obtain the predicted matching degree between its two sequences. By selecting a semantic matching model consistent with the requirements of the current application scenario, this not only shortens the training cycle of the semantic matching model and reduces the training cost, but also improves the prediction accuracy and flexibility of the semantic matching model.
To realize the above embodiments, the present application further proposes an electronic device.

Fig. 6 is a structural schematic diagram of an electronic device according to one embodiment of the present invention.

As shown in Fig. 6, the electronic device 200 includes:

a memory 210, a processor 220, and a bus 230 connecting the different components (including the memory 210 and the processor 220). The memory 210 stores a computer program, and when the processor 220 executes the program, the semantic matching model training method described in the embodiments of the present application is realized.
The bus 230 represents one or more of several classes of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The electronic device 200 typically comprises a variety of electronic-device-readable media. These media can be any available media accessible by the electronic device 200, including volatile and non-volatile media, and removable and non-removable media.

The memory 210 may also include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 240 and/or cache memory 250. The electronic device 200 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 260 can be used for reading from and writing to non-removable, non-volatile magnetic media (not shown in Fig. 6, commonly called a "hard drive"). Although not shown in Fig. 6, a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from and writing to a removable, non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) can be provided. In such instances, each drive can be connected to the bus 230 by one or more data media interfaces. The memory 210 may include at least one program product having a set of (e.g., at least one) program modules configured to carry out the functions of the embodiments of the present application.
A program/utility 280, having a set of (at least one) program modules 270, may be stored in, for example, the memory 210. Such program modules 270 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a networking environment. The program modules 270 generally carry out the functions and/or methods of the embodiments described herein.

The electronic device 200 may also communicate with one or more external devices 290 (such as a keyboard, a pointing device, a display 291, etc.), with one or more devices that enable a user to interact with the electronic device 200, and/or with any devices (such as a network card, a modem, etc.) that enable the electronic device 200 to communicate with one or more other computing devices. Such communication can occur via input/output (I/O) interfaces 292. Moreover, the electronic device 200 can communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via a network adapter 293. As depicted, the network adapter 293 communicates with the other modules of the electronic device 200 over the bus 230. It should be understood that, although not shown in the drawings, other hardware and/or software modules could be used in conjunction with the electronic device 200, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processor 220 executes various functional applications and data processing by running the programs stored in the memory 210.

It should be noted that, for the implementation process and technical principles of the electronic device of this embodiment, reference is made to the foregoing explanation of the semantic matching model training method of the embodiments of the present application, which is not repeated here.

The electronic device provided by the embodiments of the present application can execute the foregoing semantic matching model training method: obtain training data including multiple training samples, each training sample including a sequence pair whose type is a word sequence pair or a phrase sequence pair; for each training sample, map the sequence pair in the training sample with a preset mapping layer to obtain a feature vector pair; then, according to the type of the sequence pair, select a corresponding transfer function from multiple preset transfer functions to process the feature vector pair and obtain a predicted matching degree; and finally correct the coefficients of the mapping layer and of the multiple preset transfer functions according to the predicted matching degrees of the sequence pairs in the multiple training samples, to obtain a trained semantic matching model. Because the same mapping layer is used to obtain the predicted matching degrees of the multiple transfer functions, and the coefficients of the multiple transfer functions are then corrected according to their respective predicted matching degrees, the multiple transfer functions are trained simultaneously with a shared mapping layer, which shortens the training cycle of the semantic matching model and reduces the training cost.
To realize the above embodiments, the present application further proposes a computer-readable storage medium.

The computer-readable storage medium has a computer program stored thereon; when the program is executed by a processor, the semantic matching model training method described in the embodiments of the present application is realized.

To realize the above embodiments, a further aspect of the present application provides a computer program product; when the instructions in the computer program product are executed by a processor, the semantic matching model training method described in the embodiments of the present application is realized.
In an optional implementation, this embodiment may take the form of any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer-readable signal medium may include a propagated data signal, in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated signal may take any of a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination thereof. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in one or more programming languages, or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on a consumer electronic device, partly on a consumer electronic device, as a stand-alone software package, partly on a consumer electronic device and partly on a remote electronic device, or entirely on a remote electronic device or server. In scenarios involving a remote electronic device, the remote electronic device may be connected to the consumer electronic device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external electronic device (for example, through the Internet using an Internet service provider).
Other embodiments of the present application will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed here. This application is intended to cover any variations, uses, or adaptations of the application that follow its general principles and include such departures from the present disclosure as come within common knowledge or customary practice in the art not disclosed by the application. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the application being indicated by the claims.

It should be understood that the present application is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.

Claims (13)

1. A semantic matching model training method, characterized by comprising:
obtaining training data, wherein the training data comprises a plurality of training samples, each training sample comprises a sequence pair, and the type of the sequence pair is a word sequence pair or a phrase sequence pair;
for each training sample, performing mapping processing on the sequence pair in the training sample by using a preset mapping layer, to obtain a feature vector pair;
according to the type of the sequence pair, selecting a corresponding transfer function from a plurality of preset transfer functions to process the feature vector pair, to obtain a predicted matching degree;
according to the predicted matching degrees of the sequence pairs in the plurality of training samples, correcting coefficients of the mapping layer and the plurality of preset transfer functions, to obtain a trained semantic matching model.
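Read as an algorithm rather than a claim, the method of claim 1 can be sketched in a few lines of Python. This is an illustrative toy only, not part of the patent: the embedding table, `map_sequence`, the sigmoid-of-cosine transfer function, and the gradient update are all assumptions standing in for the unspecified mapping layer, transfer functions, and coefficient-correction procedure.

```python
import math
import random

# Toy sketch of the claimed pipeline (all names are illustrative, not from
# the patent): a "mapping layer" (token embedding table + averaging) turns
# each sequence of a pair into a feature vector, a transfer function
# (sigmoid of a scaled cosine similarity) outputs a predicted matching
# degree, and the transfer-function coefficients are corrected from the
# prediction error.

random.seed(0)
DIM = 8
embedding = {t: [random.gauss(0, 0.1) for _ in range(DIM)] for t in range(50)}

def map_sequence(tokens):
    """Mapping layer: mean of the token embeddings -> one feature vector."""
    vecs = [embedding[t] for t in tokens]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm or 1e-9)

w, b = 1.0, 0.0  # transfer-function coefficients to be corrected

def predict(pair):
    """Transfer function: predicted matching degree in (0, 1)."""
    u, v = map_sequence(pair[0]), map_sequence(pair[1])
    return 1.0 / (1.0 + math.exp(-(w * cosine(u, v) + b)))

def train_step(pair, label, lr=0.5):
    """One correction step: gradient descent on the squared error."""
    global w, b
    u, v = map_sequence(pair[0]), map_sequence(pair[1])
    c = cosine(u, v)
    p = 1.0 / (1.0 + math.exp(-(w * c + b)))
    grad = (p - label) * p * (1.0 - p)  # chain rule through the sigmoid
    w -= lr * grad * c
    b -= lr * grad
    return (p - label) ** 2             # loss before the update

pair = ([1, 2, 3], [1, 2, 4])           # two token-id sequences
losses = [train_step(pair, label=1.0) for _ in range(100)]
```

Here only the transfer-function coefficients `w` and `b` are corrected for brevity; the claim also covers correcting the mapping-layer coefficients, which would follow the same gradient pattern.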
2. The method according to claim 1, characterized in that the transfer function corresponding to the word sequence pair comprises: a bag-of-words model;
the transfer functions corresponding to the phrase sequence pair comprise: a bag-of-words model, a convolutional neural network model, and a general regression neural network model;
and the selecting, according to the type of the sequence pair, a corresponding transfer function from the plurality of preset transfer functions to process the feature vector pair, to obtain a predicted matching degree, comprises:
if the type of the sequence pair is a word sequence pair, processing the feature vector pair by using the bag-of-words model to obtain a predicted matching degree;
if the type of the sequence pair is a phrase sequence pair, processing the feature vector pair by using the bag-of-words model, the convolutional neural network model, and the general regression neural network model respectively, to obtain predicted matching degrees.
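The type-dependent dispatch in claim 2 amounts to a lookup table from sequence-pair type to a list of transfer functions. A hedged sketch, not part of the claims, with stub scorers standing in for the real bag-of-words, convolutional neural network, and general regression neural network models (the constant return values are placeholders, not patent content):

```python
# Stub transfer functions; in the patent these would be the bag-of-words
# model, the convolutional neural network model, and the general regression
# neural network model. The constant scores below are placeholders.

def bow_score(u, v):
    return 0.8

def cnn_score(u, v):
    return 0.6

def grnn_score(u, v):
    return 0.7

# Per claim 2: a word sequence pair is scored by one transfer function,
# a phrase sequence pair by all three.
TRANSFER_FUNCTIONS = {
    "word": [bow_score],
    "phrase": [bow_score, cnn_score, grnn_score],
}

def predicted_matching_degrees(feature_pair, pair_type):
    """Apply every transfer function registered for this pair type."""
    u, v = feature_pair
    return [f(u, v) for f in TRANSFER_FUNCTIONS[pair_type]]

word_scores = predicted_matching_degrees((None, None), "word")
phrase_scores = predicted_matching_degrees((None, None), "phrase")
```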
3. The method according to claim 1, characterized by further comprising, after the trained semantic matching model is obtained:
obtaining a sequence pair to be processed and an application scenario;
adjusting the transfer functions in the semantic matching model according to the application scenario, to obtain a first semantic matching model corresponding to the application scenario;
inputting the sequence pair to be processed into the first semantic matching model, to obtain a predicted matching degree between the two sequences of the sequence pair to be processed.
4. The method according to claim 3, characterized in that the adjusting the transfer functions in the semantic matching model according to the application scenario, to obtain a first semantic matching model corresponding to the application scenario, comprises:
determining a first transfer function corresponding to the application scenario;
judging whether the first transfer function is included in the plurality of preset transfer functions;
if the first transfer function is included in the plurality of preset transfer functions, deleting the transfer functions other than the first transfer function from the plurality of preset transfer functions, to obtain the first semantic matching model corresponding to the application scenario;
if the first transfer function is not included in the plurality of preset transfer functions, replacing the plurality of preset transfer functions in the semantic matching model with the first transfer function, to obtain the first semantic matching model corresponding to the application scenario, and training the first semantic matching model by using training data of the application scenario.
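The two branches of claim 4 can be summarized as: keep only the scenario's transfer function when it is already among the preset ones, otherwise swap it in and retrain. A minimal sketch, not part of the claims, with transfer functions represented as plain strings ("bow", "bert", etc. are illustrative labels, not patent content):

```python
def adjust_for_scenario(preset_fns, scenario_fn):
    """Return (transfer functions to keep, whether retraining is needed)."""
    if scenario_fn in preset_fns:
        # First branch: delete every preset transfer function except the
        # scenario's own; existing coefficients are reused as-is.
        return [scenario_fn], False
    # Second branch: replace the preset set with the scenario's transfer
    # function; the resulting model must be retrained on scenario data.
    return [scenario_fn], True

preset = ["bow", "cnn", "grnn"]
kept = adjust_for_scenario(preset, "cnn")      # already preset: no retraining
swapped = adjust_for_scenario(preset, "bert")  # not preset: retrain required
```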
5. The method according to claim 3, characterized in that, if there are a plurality of first transfer functions, the method further comprises:
determining a weight value of each first transfer function;
and the inputting the sequence pair to be processed into the first semantic matching model, to obtain a predicted matching degree between the two sequences of the sequence pair to be processed, comprises:
inputting the sequence pair to be processed into the first semantic matching model, to obtain the predicted matching degree output by each first transfer function;
according to the weight value of each first transfer function, performing a weighted summation of the output predicted matching degrees, to obtain the predicted matching degree of the sequence pair to be processed.
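When several first transfer functions remain, claim 5 fuses their outputs by a weighted sum. A one-function sketch, not part of the claims (the degrees and weights below are made-up example values):

```python
def fuse(degrees, weights):
    """Weighted sum of per-transfer-function predicted matching degrees."""
    if len(degrees) != len(weights):
        raise ValueError("one weight per transfer function")
    return sum(d * w for d, w in zip(degrees, weights))

# Three transfer functions; choosing weights that sum to 1 keeps the fused
# degree on the same scale as the individual predictions.
fused = fuse([0.9, 0.6, 0.75], [0.5, 0.3, 0.2])
```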
6. A semantic matching model training apparatus, characterized by comprising:
a first obtaining module, configured to obtain training data, wherein the training data comprises a plurality of training samples, each training sample comprises a sequence pair, and the type of the sequence pair is a word sequence pair or a phrase sequence pair;
a mapping module, configured to, for each training sample, perform mapping processing on the sequence pair in the training sample by using a preset mapping layer, to obtain a feature vector pair;
a processing module, configured to select, according to the type of the sequence pair, a corresponding transfer function from a plurality of preset transfer functions to process the feature vector pair, to obtain a predicted matching degree;
a correction module, configured to correct coefficients of the mapping layer and the plurality of preset transfer functions according to the predicted matching degrees of the sequence pairs in the plurality of training samples, to obtain a trained semantic matching model.
7. The apparatus according to claim 6, characterized in that the transfer function corresponding to the word sequence pair comprises: a bag-of-words model;
the transfer functions corresponding to the phrase sequence pair comprise: a bag-of-words model, a convolutional neural network model, and a general regression neural network model;
and the processing module is specifically configured to:
if the type of the sequence pair is a word sequence pair, process the feature vector pair by using the bag-of-words model to obtain a predicted matching degree;
if the type of the sequence pair is a phrase sequence pair, process the feature vector pair by using the bag-of-words model, the convolutional neural network model, and the general regression neural network model respectively, to obtain predicted matching degrees.
8. The apparatus according to claim 6, characterized by further comprising:
a second obtaining module, configured to obtain a sequence pair to be processed and an application scenario;
an adjustment module, configured to adjust the transfer functions in the semantic matching model according to the application scenario, to obtain a first semantic matching model corresponding to the application scenario;
an input module, configured to input the sequence pair to be processed into the first semantic matching model, to obtain a predicted matching degree between the two sequences of the sequence pair to be processed.
9. The apparatus according to claim 8, characterized in that the adjustment module is specifically configured to:
determine a first transfer function corresponding to the application scenario;
judge whether the first transfer function is included in the plurality of preset transfer functions;
if the first transfer function is included in the plurality of preset transfer functions, delete the transfer functions other than the first transfer function from the plurality of preset transfer functions, to obtain the first semantic matching model corresponding to the application scenario;
if the first transfer function is not included in the plurality of preset transfer functions, replace the plurality of preset transfer functions in the semantic matching model with the first transfer function, to obtain the first semantic matching model corresponding to the application scenario, and train the first semantic matching model by using training data of the application scenario.
10. The apparatus according to claim 8, characterized in that, if there are a plurality of first transfer functions, the apparatus further comprises: a determining module, configured to determine a weight value of each first transfer function;
and the input module is specifically configured to:
input the sequence pair to be processed into the first semantic matching model, to obtain the predicted matching degree output by each first transfer function;
according to the weight value of each first transfer function, perform a weighted summation of the output predicted matching degrees, to obtain the predicted matching degree of the sequence pair to be processed.
11. An electronic device, characterized by comprising:
a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the semantic matching model training method according to any one of claims 1 to 5.
12. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the semantic matching model training method according to any one of claims 1 to 5.
13. A computer program product, characterized in that instructions in the computer program product, when executed by a processor, implement the semantic matching model training method according to any one of claims 1 to 5.
CN201910332189.7A 2019-04-24 2019-04-24 Semantic matching model training method and device, electronic equipment and storage medium Active CN110083834B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910332189.7A CN110083834B (en) 2019-04-24 2019-04-24 Semantic matching model training method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910332189.7A CN110083834B (en) 2019-04-24 2019-04-24 Semantic matching model training method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110083834A true CN110083834A (en) 2019-08-02
CN110083834B CN110083834B (en) 2023-05-09

Family

ID=67416477

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910332189.7A Active CN110083834B (en) 2019-04-24 2019-04-24 Semantic matching model training method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110083834B (en)


Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156023A (en) * 2015-03-23 2016-11-23 华为技术有限公司 The methods, devices and systems of semantic matches
CN107153664A (en) * 2016-03-04 2017-09-12 同方知网(北京)技术有限公司 A kind of method flow that research conclusion is simplified based on the scientific and technical literature mark that assemblage characteristic is weighted
CN107239574A (en) * 2017-06-29 2017-10-10 北京神州泰岳软件股份有限公司 A kind of method and device of intelligent Answer System knowledge problem matching
CN107329949A (en) * 2017-05-24 2017-11-07 北京捷通华声科技股份有限公司 A kind of semantic matching method and system
CN107491547A (en) * 2017-08-28 2017-12-19 北京百度网讯科技有限公司 Searching method and device based on artificial intelligence
WO2018029696A1 (en) * 2016-08-10 2018-02-15 Telefonaktiebolaget Lm Ericsson (Publ) Methods and apparatus for semantic knowledge transfer
CN107704563A (en) * 2017-09-29 2018-02-16 广州多益网络股份有限公司 A kind of question sentence recommends method and system
CN107729300A (en) * 2017-09-18 2018-02-23 百度在线网络技术(北京)有限公司 Processing method, device, equipment and the computer-readable storage medium of text similarity
CN108304378A (en) * 2018-01-12 2018-07-20 深圳壹账通智能科技有限公司 Text similarity computing method, apparatus, computer equipment and storage medium
CN109145299A (en) * 2018-08-16 2019-01-04 北京金山安全软件有限公司 Text similarity determination method, device, equipment and storage medium
CN109241268A (en) * 2018-07-05 2019-01-18 腾讯科技(深圳)有限公司 A kind of analog information recommended method, device, equipment and storage medium
CN109284399A (en) * 2018-10-11 2019-01-29 深圳前海微众银行股份有限公司 Similarity prediction model training method, equipment and computer readable storage medium
CN109359793A (en) * 2018-08-03 2019-02-19 阿里巴巴集团控股有限公司 A kind of prediction model training method and device for new scene
CN109359302A (en) * 2018-10-26 2019-02-19 重庆大学 A kind of optimization method of field term vector and fusion sort method based on it
CN109508377A (en) * 2018-11-26 2019-03-22 南京云思创智信息科技有限公司 Text feature, device, chat robots and storage medium based on Fusion Model
CN109522920A (en) * 2018-09-18 2019-03-26 义语智能科技(上海)有限公司 Training method and equipment based on the synonymous discrimination model for combining semantic feature


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QIAO MENG: "Research on a Fusion Ranking Model for Domain-Adapted Word Vectors", China Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674260A (en) * 2019-09-27 2020-01-10 北京百度网讯科技有限公司 Training method and device of semantic similarity model, electronic equipment and storage medium
CN110674260B (en) * 2019-09-27 2022-05-24 北京百度网讯科技有限公司 Training method and device of semantic similarity model, electronic equipment and storage medium
CN110795945A (en) * 2019-10-30 2020-02-14 腾讯科技(深圳)有限公司 Semantic understanding model training method, semantic understanding device and storage medium
CN110795945B (en) * 2019-10-30 2023-11-14 腾讯科技(深圳)有限公司 Semantic understanding model training method, semantic understanding device and storage medium
CN110956018A (en) * 2019-11-22 2020-04-03 腾讯科技(深圳)有限公司 Training method of text processing model, text processing method, text processing device and storage medium
CN111191457A (en) * 2019-12-16 2020-05-22 浙江大搜车软件技术有限公司 Natural language semantic recognition method and device, computer equipment and storage medium
CN111191457B (en) * 2019-12-16 2023-09-15 浙江大搜车软件技术有限公司 Natural language semantic recognition method, device, computer equipment and storage medium
CN111198938A (en) * 2019-12-26 2020-05-26 深圳市优必选科技股份有限公司 Sample data processing method, sample data processing device and electronic equipment
CN111198938B (en) * 2019-12-26 2023-12-01 深圳市优必选科技股份有限公司 Sample data processing method, sample data processing device and electronic equipment

Also Published As

Publication number Publication date
CN110083834B (en) 2023-05-09

Similar Documents

Publication Publication Date Title
CN110083834A (en) Semantic matches model training method, device, electronic equipment and storage medium
WO2020207431A1 (en) Document classification method, apparatus and device, and storage medium
CN108846126B (en) Generation of associated problem aggregation model, question-answer type aggregation method, device and equipment
CN110377714A (en) Text matching technique, device, medium and equipment based on transfer learning
CN109033140A (en) A kind of method, apparatus, equipment and the computer storage medium of determining search result
CN107832432A (en) A kind of search result ordering method, device, server and storage medium
CN107797985A (en) Establish synonymous discriminating model and differentiate the method, apparatus of synonymous text
JP6756079B2 (en) Artificial intelligence-based ternary check method, equipment and computer program
CN107168992A (en) Article sorting technique and device, equipment and computer-readable recording medium based on artificial intelligence
CN111400601B (en) Video recommendation method and related equipment
JP6785921B2 (en) Picture search method, device, server and storage medium
CN107832476A (en) A kind of understanding method of search sequence, device, equipment and storage medium
CN109271542A (en) Cover determines method, apparatus, equipment and readable storage medium storing program for executing
CN108549656A (en) Sentence analytic method, device, computer equipment and readable medium
CN109599095A (en) A kind of mask method of voice data, device, equipment and computer storage medium
CN109325108A (en) Inquiry processing method, device, server and storage medium
CN109783631A (en) Method of calibration, device, computer equipment and the storage medium of community's question and answer data
CN108491421A (en) A kind of method, apparatus, equipment and computer storage media generating question and answer
CN110378346A (en) Establish the method, apparatus, equipment and computer storage medium of Text region model
CN112307048B (en) Semantic matching model training method, matching method, device, equipment and storage medium
CN109408829A (en) Article readability determines method, apparatus, equipment and medium
CN109408175B (en) Real-time interaction method and system in general high-performance deep learning calculation engine
CN108268602A (en) Analyze method, apparatus, equipment and the computer storage media of text topic point
CN117093687A (en) Question answering method and device, electronic equipment and storage medium
CN110489740A (en) Semantic analytic method and Related product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant