CN117668342A

CN117668342A - Training method of double-tower model and commodity recall method

Info

Publication number: CN117668342A
Application number: CN202311685717.XA
Authority: CN
Inventors: 李明明; 王彬彬; 卓靖炜; 刘林
Original assignee: Beijing Wodong Tianjun Information Technology Co Ltd
Current assignee: Beijing Wodong Tianjun Information Technology Co Ltd
Priority date: 2023-12-08
Filing date: 2023-12-08
Publication date: 2024-03-08

Abstract

The present disclosure provides a training method of a double-tower model and a commodity recall method, the training method includes obtaining a plurality of positive and negative sample pairs, each positive and negative sample pair including positive and negative samples corresponding to the same sample query word; inputting the positive and negative sample pairs into a double-tower model for training, and obtaining a first super-parameter corresponding to the positive sample and a second super-parameter corresponding to the negative sample; according to the positive sample and the first super parameter, the negative sample and the second super parameter, obtaining an objective function for optimizing model parameters of the double-tower model; according to the method, the super parameters of sample level can be obtained by inputting positive and negative sample pairs into the double-tower model for training, the adaptability and the high efficiency of obtaining the super parameters are improved, and the training efficiency and the model performance of the double-tower model are improved.

Description

Training method of double-tower model and commodity recall method

Technical Field

The disclosure relates to the technical field of information processing, in particular to a training method of a double-tower model and a commodity recall method.

Background

At present, for commodity recall scenes, related commodities are often determined based on a double-tower model, however, an objective function of the double-tower model is very sensitive to super parameters, in the related technology, the super parameters can be traversed as much as possible through a grid search method, namely by a tabulation mode, then the model with the best performance is selected, and the selection of the parameters can be accelerated through a super parameter optimization method of nerve structure search (Neural Architecture Search, NAS for short), namely through an approximate double-layer optimization training model, however, the method has lower efficiency of acquiring the super parameters, higher training cost and lower self-adaptability of the super parameters, so that how to acquire the super parameters more efficiently, improve the self-adaptability of the super parameters and improve the performance of the model, and the problem to be solved is urgent.

Disclosure of Invention

The present disclosure provides a training method, commodity recall method, apparatus, electronic device, storage medium and computer program product for a dual tower model.

An embodiment of a first aspect of the present disclosure provides a training method for a dual-tower model, including: acquiring a plurality of positive and negative sample pairs, wherein each positive and negative sample pair comprises a positive sample and a negative sample corresponding to the same sample query word, the positive sample further comprises first commodity information, and the negative sample further comprises second commodity information; inputting the positive and negative sample pairs into a double-tower model for training, and obtaining a first super-parameter corresponding to the positive sample and a second super-parameter corresponding to the negative sample, wherein the first super-parameter and the second super-parameter are used for obtaining an objective function for optimizing model parameters of the double-tower model; obtaining an objective function for optimizing model parameters of the double-tower model according to the positive sample, the first super parameter, the negative sample and the second super parameter; and optimizing the model parameters of the double-tower model according to the objective function, and returning to continue training until the training ending condition is met to obtain a target double-tower model, wherein the target double-tower model is used for determining the mapping relation between the query word and commodity information.

In the embodiment of the disclosure, a plurality of positive and negative sample pairs are obtained, each positive and negative sample pair comprises a positive sample and a negative sample corresponding to the same sample query word, wherein the positive sample further comprises first commodity information, the negative sample further comprises second commodity information, the positive and negative sample pairs are input into a double-tower model for training, a first super parameter corresponding to the positive sample and a second super parameter corresponding to the negative sample are obtained, the first super parameter and the second super parameter are used for obtaining an objective function for optimizing model parameters of the double-tower model, according to the positive sample, the first super parameter, the negative sample and the second super parameter, the objective function for optimizing the model parameters of the double-tower model is obtained, and training is continued until a training end condition is met, wherein the objective double-tower model is used for determining a mapping relation between the query word and the commodity information.

An embodiment of a second aspect of the present disclosure provides a commodity recall method, including: acquiring a user query word, inputting the user query word into a target double-tower model, and determining a first target vector corresponding to the user query word based on the user query word through the target double-tower model; determining a second target vector corresponding to the candidate commodity through the target double-tower model, and establishing an index set according to the second target vector; and searching candidate indexes in the index set according to the first target vector to determine target commodities associated with the user query word, wherein the target information matching model is a model trained by the method according to any one of the first aspects.

In the embodiment of the disclosure, the user query word is input to the target double-tower model by acquiring the user query word, the first target vector corresponding to the user query word is determined based on the user query word through the target double-tower model, the second target vector corresponding to the candidate commodity is determined through the target double-tower model, the index set is established according to the second target vector, the candidate index in the index set is searched according to the first target vector, and the target commodity associated with the user query word is determined.

An embodiment of a third aspect of the present disclosure provides a training device for a dual-tower model, including: the first acquisition module is used for acquiring a plurality of positive and negative sample pairs, wherein each positive and negative sample pair comprises a positive sample and a negative sample corresponding to the same sample query word, the positive sample further comprises first commodity information, and the negative sample further comprises second commodity information; the second acquisition module is used for inputting the positive and negative sample pairs into a double-tower model for training, and acquiring a first super-parameter corresponding to the positive sample and a second super-parameter corresponding to the negative sample, wherein the first super-parameter and the second super-parameter are used for acquiring an objective function for optimizing model parameters of the double-tower model; the third acquisition module is used for acquiring an objective function for optimizing model parameters of the double-tower model according to the positive sample, the first super-parameter, the negative sample and the second super-parameter; and the training module is used for optimizing the model parameters of the double-tower model according to the objective function and returning to continue training until the training ending condition is met to obtain a target double-tower model, wherein the target double-tower model is used for determining the mapping relation between the query word and the commodity information.

An embodiment of a fourth aspect of the present disclosure provides a commodity recall device, including: the first acquisition module is used for acquiring a plurality of positive and negative sample pairs, wherein each positive and negative sample pair comprises a positive sample and a negative sample corresponding to the same sample query word; the second acquisition module is used for inputting the positive and negative sample pairs into a double-tower model for training, and acquiring a first super parameter corresponding to the positive sample and a second super parameter corresponding to the negative sample; the third acquisition module is used for acquiring an objective function for optimizing model parameters of the double-tower model according to the positive sample, the first super-parameter, the negative sample and the second super-parameter; and the training module is used for optimizing the model parameters of the double-tower model according to the objective function and returning to continue training until the training ending condition is met to obtain a target double-tower model, wherein the target double-tower model is used for determining the mapping relation between the query word and the commodity.

An embodiment of a fifth aspect of the present disclosure proposes an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the training method of the double-tower model as described above for the first aspect embodiment or the commodity recall method of the second aspect embodiment.

An embodiment of a sixth aspect of the present disclosure proposes a computer-readable storage medium storing computer instructions for causing the computer to perform the training method of the double tower model as the embodiment of the first aspect or the commodity recall method of the embodiment of the second aspect described above.

An embodiment of a seventh aspect of the present disclosure proposes a computer program product comprising a computer program which, when executed by a processor, implements the training method of the dual tower model of the embodiment of the first aspect of the present disclosure or the merchandise recall method of the embodiment of the second aspect.

Additional aspects and advantages of the disclosure will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the disclosure.

Drawings

The foregoing and/or additional aspects and advantages of the present disclosure will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a flow chart of a training method of a dual-tower model according to an embodiment of the present disclosure;

FIG. 2 is a flow chart of a training method of a dual tower model according to another embodiment of the present disclosure;

FIG. 3 is a flow chart of a training method of a dual tower model according to another embodiment of the present disclosure;

FIG. 4 is a flow chart of a training method of a dual tower model according to another embodiment of the present disclosure;

FIG. 5 is a flow chart of a commodity recall method according to an embodiment of the present disclosure;

FIG. 6 is a schematic structural diagram of a training device for a dual tower model according to an embodiment of the present disclosure;

FIG. 7 is a schematic diagram of a commodity recall device according to an embodiment of the present disclosure;

fig. 8 is a block diagram of an electronic device provided in an embodiment of the present disclosure.

Detailed Description

Embodiments of the present disclosure are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are exemplary and intended for the purpose of explaining the present disclosure and are not to be construed as limiting the present disclosure.

The following describes a training method, an apparatus, an electronic device, and a storage medium of a dual-tower model according to an embodiment of the present disclosure with reference to the accompanying drawings.

Fig. 1 is a flow chart of a training method of a dual-tower model according to an embodiment of the disclosure.

As shown in fig. 1, the method comprises the steps of:

s101, a plurality of positive and negative sample pairs are obtained, each positive and negative sample pair comprises a positive sample and a negative sample corresponding to the same sample query word, wherein the positive sample further comprises first commodity information, and the negative sample further comprises second commodity information.

It should be noted that, the specific manner of acquiring a plurality of positive and negative sample pairs is not limited in this disclosure, and may be selected according to actual situations.

Optionally, historical behavior data of the user in a preset time window may be obtained, a sample query word query is obtained from the historical behavior data, first commodity information clicked by the user in the sample query word query is obtained, the first commodity information clicked by the user is taken as positive sample positive, second commodity information not clicked by the user in the sample query word query is obtained, and the second commodity information not clicked by the user is taken as negative sample negative, so that a plurality of positive and negative sample pairs are obtained.

It should be noted that, the sample query term includes a sample user feature and a query term feature, for example: the sample user features may include features of gender, age, region, etc. of the sample user, the query term features may include features of a query target, etc., and the features of the merchandise information include merchandise title features, merchandise affiliated store features, merchandise price features, merchandise brand features, etc.

It should be noted that, the setting of the preset time window is not limited in the present disclosure, and may be set according to actual situations.

Alternatively, the preset time window may be set to 5 months, 1 year, or the like.

S102, inputting the positive and negative sample pairs into a double-tower model for training, and obtaining a first super parameter corresponding to the positive sample and a second super parameter corresponding to the negative sample, wherein the first super parameter and the second super parameter are used for obtaining an objective function of model parameters of the optimized double-tower model.

The double-tower model comprises a user tower and a commodity tower, and the structure of each tower is divided into an input layer, a coding layer and an output layer.

Alternatively, the dual-tower model may be a deep semantic matching model (Deep Structured Semantic Models, DSSM for short).

In the embodiment of the disclosure, after the positive and negative sample pairs are obtained, the positive and negative sample pairs may be input into a double-tower model for training, and the first super-parameters corresponding to the positive samples and the second super-parameters corresponding to the negative samples are obtained.

It should be noted that, in the model training process, the model performance is very sensitive to the super parameter, and different samples should be corresponding to different super parameters, so the disclosure provides a parameter-free super parameter acquisition method.

In the embodiment of the disclosure, the positive and negative sample pairs may be input into the twin-tower model for training, and a preset first learning parameter is obtained as a first super parameter τ in the training process _p 。

In the embodiment of the disclosure, positive and negative sample pairs may be input into a double-tower model, sample word vectors of sample query words are respectively obtained by user towers in the double-tower model, first sample vectors of first commodity information and second sample vectors of second commodity information are obtained by commodity towers in the double-tower model, and second super parameters tau corresponding to negative samples are obtained based on the sample word vectors, the first sample vectors and the second sample vectors _n 。

S103, obtaining an objective function for optimizing model parameters of the double-tower model according to the positive sample and the first super parameter, and the negative sample and the second super parameter.

In the embodiment of the disclosure, positive and negative sample pairs can be input into a double-tower model, sample word vectors of sample query words are respectively obtained by a user tower in the double-tower model, and first sample vectors of first commodity information and second sample vectors of second commodity information are obtained by commodity towers in the double-tower model.

Alternatively, a first similarity between the first sample vector and the second sample vector may be obtained, a second similarity between the sample word vector and the first sample vector may be obtained, a third similarity between the sample word vector and the second sample vector may be obtained, a first objective function and a second objective function of the dual-tower model may be obtained according to the first similarity, the second similarity, the third similarity, the first super parameter and the second super parameter, and an objective function of the dual-tower model may be obtained according to the first objective function and the second objective function.

And S104, optimizing model parameters of the double-tower model according to the objective function, returning to continue training until the training ending condition is met to obtain a target double-tower model, wherein the target double-tower model is used for determining the mapping relation between the query word and commodity information.

In the embodiment of the disclosure, after the objective function is obtained, model parameters of the double-tower model can be optimized according to the objective function, and training of the double-tower model after the model parameters are optimized is continued until the training ending condition is met, so that the objective double-tower model is obtained.

Optionally, training the double-tower model by using a gradient back propagation algorithm, and optimizing model parameters of the double-tower model by using an Adam optimizer according to an objective function until a training ending condition is met to obtain the target double-tower model.

It should be noted that, the setting of the model training ending condition is not limited in the present disclosure, and the setting of the model training ending condition may be performed according to actual situations.

Optionally, the model training ending condition may be set such that the loss function value is smaller than a preset loss threshold;

optionally, the model training ending condition may be set so that the number of optimization times of the model parameters of the double-tower model reaches a preset number threshold.

In the embodiment of the disclosure, the learning rate of the double-tower model can be set to be 5e-5, the data volume of each optimization batch is 256, and the iterative optimization times are 100 ten thousand times.

In the embodiment of the disclosure, a plurality of positive and negative sample pairs are obtained, each positive and negative sample pair comprises a positive sample and a negative sample corresponding to the same sample query word, the positive and negative sample pairs are input into a double-tower model for training, a first super parameter corresponding to the positive sample and a second super parameter corresponding to the negative sample are obtained, an objective function for optimizing model parameters of the double-tower model is obtained according to the positive sample and the first super parameter, and the negative sample and the second super parameter, the model parameters of the double-tower model are optimized and returned for continuous training according to the objective function until a training end condition is met to obtain a target double-tower model, wherein the target double-tower model is used for determining a mapping relation between the query word and commodity information.

Fig. 2 is a flow chart of a training method of a dual-tower model according to an embodiment of the present disclosure, and further illustrates a process of obtaining a second super parameter with reference to fig. 2 based on the above embodiment, including the following steps:

S201, inputting positive and negative sample pairs into a double-tower model, respectively obtaining sample word vectors of sample query words by a user tower in the double-tower model, and obtaining first sample vectors of first commodity information and second sample vectors of second commodity information by commodity towers in the double-tower model.

The positive sample comprises first commodity information, and the negative sample comprises second commodity information.

It should be noted that, the user tower and the commodity tower all include input layer, coding layer and output layer, wherein, input layer can be with the vector of input data coding fixed length, and the coding layer converts and fuses input data, and the coding layer can be arbitrary deep, shallow layer network structure, for example: the output layer converts the encoded information into a vector of fixed dimension, which may be a BERT encoder (4 layers).

In the embodiment of the disclosure, positive and negative sample pairs may be input into a double-tower model, sample word vectors Q of sample query words are respectively obtained by user towers in the double-tower model, and first sample vectors I of first commodity information are obtained by commodity towers in the double-tower model _p And a second sample vector I of the second commodity information _n 。

S202, based on the sample word vector, the first sample vector and the second sample vector, obtaining a second super parameter corresponding to the negative sample.

On the basis of the above embodiment, further referring to fig. 3, a process of obtaining a second super parameter corresponding to a negative sample based on a sample word vector, a first sample vector and a second sample vector is explained, which includes the following steps:

s301, obtaining a first similarity between the first sample vector and the second sample vector.

In the embodiment of the present disclosure, after the first sample vector and the second sample vector are acquired, a first similarity between the first sample vector and the second sample vector may be acquired.

Alternatively, a first sample vector I may be calculated _p And a second sample vector I _n To obtain a first similarity s (I _p ，I _n )。

S302, acquiring a second learning parameter and a third learning parameter, and acquiring a second super parameter corresponding to the negative sample according to the second learning parameter, the third learning parameter and the first similarity.

The second learning parameter and the third learning parameter may be globally fixed, or may be learned, for example: initializing a variable for the second learning parameter and the third learning parameter, and updating learning acquisition through gradient propagation.

In the embodiment of the present disclosure, the following formula may be used to obtain the second super parameter corresponding to the negative sample:

τ _n ＝α*(1-<sg(I _p )，I _n >)+b

Wherein τ _n Is a second super parameter, alpha is a second learning parameter, b is a third learning parameter, sg is gradient cut-off,<sg(I _p )，I _n >is the first similarity.

In the embodiment of the disclosure, in the process of acquiring the first super parameter, the positive and negative sample pairs may be input into the double-tower model for training, and a preset first learning parameter is acquired as the first super parameter in the training process.

It should be noted that, for the first super-parameter, a preset first learning parameter may be used as the first super-parameter, i.e., τ _p ＝α*(1-<sg(I _P )，I _p >)+c＝c

In the embodiment of the disclosure, positive and negative sample pairs are input into a double-tower model, sample word vectors of sample query words are respectively acquired by a user tower in the double-tower model, a first sample vector of first commodity information and a second sample vector of second commodity information are acquired by a commodity tower in the double-tower model, second super parameters corresponding to negative samples are acquired based on the sample word vectors, the first sample vector and the second sample vector, positive and negative sample pairs are input into the double-tower model for training, and a preset first learning parameter is acquired as a first super parameter in the training process.

Fig. 4 is a flow chart of a training method of a dual-tower model according to an embodiment of the present disclosure, and further with reference to fig. 4, a process of obtaining an objective function for optimizing model parameters of the dual-tower model according to a positive sample and a first super parameter, and a negative sample and a second super parameter is explained based on the above embodiment, and includes the following steps:

s401, obtaining a second similarity between the sample word vector and the first sample vector, and obtaining a third similarity between the sample word vector and the second sample vector.

Alternatively, a sample word vector Q and a first sample vector I may be calculated _p Inner product between to obtain a second similarity s (Q, I _p )。

Alternatively, a sample word vector Q and a second sample vector I may be calculated _n Inner product between the two to obtain a third similarity s (Q, I) _n )。

S402, acquiring a first objective function and a second objective function of the double-tower model according to the first similarity, the second similarity, the third similarity, the first super parameter and the second super parameter.

In the embodiment of the disclosure, after the first similarity, the second similarity, the third similarity, the first superparameter and the second superparameter are obtained, two ways may be used to obtain the first objective function and the second objective function of the dual-tower model.

For the first mode: the first objective function may be obtained based on the second similarity, the first hyper-parameter, the third similarity, and the second hyper-parameter.

In the disclosed embodiment, the first objective function may be obtained using the following formula:

wherein L is _soft For a first objective function, s (Q, I _p ) For a second similarity, s (Q, I _n ) For a third similarity, τ _p For the first super parameter τ _n Is the second super parameter.

Note that, for the first mode: the second objective function may be obtained based on the first similarity, the first hyper-parameter, the second similarity, and the second hyper-parameter.

In the disclosed embodiment, the second objective function may be obtained using the following formula:

wherein L is _symm1 For the second objective function, s (I _p ，I _n ) For the first similarity, s (Q, I _p ) For a second degree of similarity, τ _p For the first super parameter τ _n Is the second super parameter.

Note that, for the second mode: the first objective function may be obtained based on the second similarity, the third similarity, and the second hyper-parameter.

wherein L is _margin For a first objective function, s (Q, I _p ) For a second similarity, s (Q, I _n ) For a third similarity, τ _n Is the second super parameter.

It should be noted that, for the second manner, the second objective function may be obtained according to the first similarity, the first super parameter, and the second similarity.

wherein L is _symm2 For the second objective function, s (I _p ，I _n ) For the first similarity, s (Q, I _p ) For a second degree of similarity, τ _p Is the first super parameter.

S403, acquiring an objective function of the double-tower model according to the first objective function and the second objective function.

In the embodiment of the disclosure, after the first objective function and the second objective function are acquired, the sum value of the first objective function and the second objective function may be acquired, and the sum value is used as the objective function of the double-tower model.

Optionally, for the first objective function, L _soft A second objective function of L _symm1 Then the objective function of the double tower model l=l _soft +L _symm1 The method comprises the steps of carrying out a first treatment on the surface of the Optionally, for the first objective function, L _margin A second objective function of L _symm2 Then the objective function of the double tower model l=l _margin +L _symm2 。

In the embodiment of the disclosure, after the objective function for optimizing the model parameters of the double-tower model is obtained, the model parameters of the double-tower model may be optimized according to the objective function and returned to continue training until the training end condition is satisfied to obtain the target double-tower model.

In the embodiment of the disclosure, the first objective function and the second objective function of the double-tower model are obtained according to the first similarity, the second similarity, the third similarity, the first super parameter and the second super parameter, and the first objective function and the second objective function of the double-tower model are obtained by obtaining the second similarity between the sample word vector and the first sample vector, and obtaining the third similarity between the sample word vector and the second sample vector.

Fig. 5 is a flowchart of a commodity recall method according to an embodiment of the present disclosure.

As shown in fig. 5, the method comprises the steps of:

s501, acquiring a user query word, inputting the user query word into a target double-tower model, and determining a first target vector corresponding to the user query word based on the user query word through the target double-tower model.

It should be noted that, the user query term (query) is query information input by the user at the terminal device, for example: query information input by a user on a mobile phone and query information input by the user on a computer.

It should be noted that, after the user inputs the query information (user query word) at the terminal device, the user query word may be directly obtained.

It should be noted that the present disclosure is not limited to the form of the query words of the user. For example: the user query term may be a word, sentence, paragraph of text, etc.

For example, the user query term may be "cell phone"; the user query term may also be "give birth to girl gift".

It should be noted that, the target double-tower model is pre-trained by adopting a deep learning algorithm, and the target double-tower model can determine the first target vector corresponding to the user query word in real time based on the user query word.

S502, determining a second target vector corresponding to the candidate commodity through a target double-tower model, and establishing an index set according to the second target vector.

In the embodiment of the application, the second target vector corresponding to the candidate commodity can be predetermined through the target double-tower model, and the index set is established according to the target vector.

And S503, searching candidate indexes in the index set according to the first target vector, and determining target commodities associated with the user query word.

In the embodiment of the disclosure, the similarity between the first target vector and the candidate index may be obtained, the target index is determined from the candidate indexes according to the similarity, and the candidate commodity associated with the target index is taken as the target commodity.

Optionally, if the similarity between the first target vector and the candidate index is greater than the similarity threshold, the candidate index is determined to be the target index.

It should be noted that, the setting of the similarity threshold is not limited in this disclosure, and may be set according to actual situations. For example: the similarity threshold may be set to 80%; also for example: the similarity threshold may be set at 85%.

For example, for a similarity threshold of 85%, if the similarity between the first target vector and the candidate index is greater than 85%, the corresponding candidate index is determined to be the target index.

Alternatively, the candidate indexes may be sorted in descending order of the similarity, and the candidate indexes of the top N of the sorting are determined as target indexes, where N is a positive integer.

The number of N is not limited in this disclosure, and may be set according to actual situations. For example: n may be set to 200; also for example: n may be set to 300.

For example, for N of 300, the plurality of candidate indexes may be sorted in descending order of similarity, and the candidate indexes of the top 300 of the sorting are determined as target indexes according to the sorting result.

In the embodiment of the disclosure, after the target index is acquired, the candidate commodity associated with the target index can be used as the target commodity, namely, recall of the commodity associated with the user query word is realized.

Further, by the commodity recall method provided by the disclosure, an online commodity recall system can be deployed, namely a model service system and a vector index service system can be deployed, when the commodity recall system receives a user query word, a first target vector corresponding to the user query word is calculated in real time through the model service system, a second target vector corresponding to a candidate commodity is calculated through the model service system, and commodity recall is realized through the vector index service system.

In the embodiment of the disclosure, the user query word is acquired and input to the target double-tower model, the first target vector corresponding to the user query word is determined based on the user query word through the target double-tower model, the second target vector corresponding to the candidate commodity is determined through the target double-tower model, the index set is established according to the second target vector, the candidate index in the index set is searched according to the first target vector, the target commodity associated with the user query word is determined, and the target commodity associated with the user query word can be determined based on the trained target double-tower model, so that the high efficiency and accuracy of commodity recall are improved.

In order to implement the training method of the double-tower model according to the first embodiment of the present disclosure, the present disclosure proposes a training device of the double-tower model, and fig. 6 is a schematic structural diagram of the training device of the double-tower model according to an embodiment of the present disclosure. As shown in fig. 6, the training apparatus 600 of the twin tower model includes:

a first obtaining module 610, configured to obtain a plurality of positive and negative sample pairs, where each positive and negative sample pair includes a positive sample and a negative sample corresponding to a same sample query word, where the positive sample further includes first commodity information, and the negative sample further includes second commodity information;

a second obtaining module 620, configured to input the positive and negative pairs of samples into a double-tower model for training, and obtain a first superparameter corresponding to the positive sample and a second superparameter corresponding to the negative sample, where the first superparameter and the second superparameter are used to obtain an objective function that optimizes model parameters of the double-tower model;

a third obtaining module 630, configured to obtain an objective function for optimizing model parameters of the dual-tower model according to the positive sample and the first super-parameter, and the negative sample and the second super-parameter;

and the training module 640 is configured to optimize the model parameters of the double-tower model according to the objective function and return to continue training until the training end condition is satisfied, thereby obtaining a target double-tower model, where the target double-tower model is used to determine a mapping relationship between the query word and the commodity information.

In one embodiment of the present disclosure, the second obtaining module 620 is further configured to: inputting the positive and negative sample pairs into the double-tower model, respectively acquiring sample word vectors of the sample query words by a user tower in the double-tower model, and acquiring first sample vectors of the first commodity information and second sample vectors of the second commodity information by a commodity tower in the double-tower model; and acquiring a second super parameter corresponding to the negative sample based on the sample word vector, the first sample vector and the second sample vector.

In one embodiment of the present disclosure, the second obtaining module 620 is further configured to: acquiring a first similarity between the first sample vector and the second sample vector; and acquiring a second learning parameter and a third learning parameter, and acquiring a second super parameter corresponding to the negative sample according to the second learning parameter, the third learning parameter and the first similarity.

In one embodiment of the present disclosure, the process of obtaining the first super parameter includes: and inputting the positive and negative sample pairs into a double-tower model for training, and acquiring a preset first learning parameter as the first super parameter in the training process.

In one embodiment of the present disclosure, the third obtaining module 630 is further configured to: obtaining a second similarity between the sample word vector and the first sample vector, and obtaining a third similarity between the sample word vector and the second sample vector; acquiring a first objective function and a second objective function of the double-tower model according to the first similarity, the second similarity, the third similarity, the first super-parameter and the second super-parameter; and acquiring an objective function of the double-tower model according to the first objective function and the second objective function.

In one embodiment of the present disclosure, the acquiring of the first objective function includes: and acquiring the first objective function according to the second similarity, the first super-parameter, the third similarity and the second super-parameter.

In one embodiment of the present disclosure, the obtaining of the second objective function includes: and acquiring the second objective function according to the first similarity, the first super-parameter, the second similarity and the second super-parameter.

In one embodiment of the present disclosure, the acquiring of the first objective function includes: and acquiring the first objective function according to the second similarity, the third similarity and the second super parameter.

In one embodiment of the present disclosure, the obtaining of the second objective function includes: and acquiring the second objective function according to the first similarity, the first super-parameter and the second similarity.

It should be noted that the explanation of the embodiment of the training method of the dual-tower model in the first aspect is also applicable to the training device of the dual-tower model in the embodiment of the disclosure, and the specific process is not repeated here.

In order to implement the commodity recall method according to the second embodiment of the present invention, a commodity recall device is provided in the present disclosure, and fig. 7 is a schematic structural diagram of the commodity recall device according to an embodiment of the present invention. As shown in fig. 7, the commodity recall device 700 includes:

the obtaining module 710 is configured to obtain a user query word, input the user query word to a target double-tower model, and determine a first target vector corresponding to the user query word based on the user query word through the target double-tower model;

a first determining module 720, configured to determine, according to the target double-tower model, a second target vector corresponding to a candidate commodity, and establish an index set according to the second target vector;

a second determining module 730, configured to retrieve candidate indexes in the index set according to the first target vector, and determine a target commodity associated with the user query term, where the target twin-tower model is a model trained by using the method according to any one of the first aspects.

In one embodiment of the present disclosure, the second determining module 730 is further configured to: obtaining the similarity between the first target vector and the candidate index; and determining a target index from the candidate indexes according to the similarity, and taking the candidate commodity associated with the target index as a target commodity.

In one embodiment of the present disclosure, the second determining module 730 is further configured to: and if the similarity between the first target vector and the candidate index is greater than the similarity threshold, determining the candidate index as the target index.

In one embodiment of the present disclosure, the second determining module 730 is further configured to: and sequencing the candidate indexes according to the similarity descending order, and determining N candidate indexes before sequencing as the target indexes, wherein N is a positive integer.

The explanation of the embodiment of the commodity recall method in the second aspect is also applicable to the commodity recall device in the embodiment of the present disclosure, and the specific process is not repeated here.

As shown in fig. 8, a block diagram of an electronic device of a training method or commodity recall method of a dual tower model according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile apparatuses, such as smart voice interaction devices, personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing apparatuses. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 8, the electronic device includes: one or more processors 801, memory 802, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor 801 may process instructions executing within the electronic device, including instructions stored in or on memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to an interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories and multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 801 is illustrated in fig. 8.

Memory 802 is a non-transitory computer-readable storage medium provided by the present disclosure. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the training method of the dual tower model provided by the embodiment of the first aspect of the present disclosure or the commodity recall method provided by the embodiment of the second aspect. The non-transitory computer readable storage medium of the present disclosure stores computer instructions for causing a computer to perform the training method of the double tower model provided by the embodiments of the first aspect of the present disclosure or the merchandise recall method provided by the embodiments of the second aspect.

The memory 802 is used as a non-transitory computer readable storage medium for storing non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the training method of the double-tower model provided by the embodiment of the first aspect of the present disclosure or the commodity recall method provided by the embodiment of the second aspect of the present disclosure. The processor 801 executes various functional applications of the server and data processing, i.e., implements the training method or commodity recall method of the two-tower model in the above-described method embodiment, by running non-transitory software programs, instructions, and modules stored in the memory 802.

Memory 802 may include a storage program area that may store an operating system, at least one application program required for functionality, and a storage data area; the storage data area may store data created according to the use of the electronic device of the training method or the commodity recall method of the twin tower model, and the like. In addition, memory 802 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 802 may optionally include memory remotely located with respect to processor 801, which may be connected to the electronics of the training method or commodity recall method of the dual tower model via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The electronic device of the training method or the commodity recall method of the double-tower model may further include: an input device 803 and an output device 804. The processor 801, memory 802, input devices 803, and output devices 804 may be connected by a bus or other means, for example in fig. 8.

The input device 803 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device of the training method or merchandise recall method of the twin tower model, such as a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointer stick, one or more mouse buttons, a track ball, a joystick, and the like. The output device 804 may include a display apparatus, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibration motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.

To achieve the above embodiments, the present disclosure further proposes a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a training method of a double tower model as proposed by the foregoing first aspect embodiments of the present disclosure or a commodity recall method as proposed by the second aspect embodiments.

To achieve the above embodiments, the present disclosure proposes a computer program product comprising a computer program which, when executed by a processor, implements the method for training a dual tower model provided by the foregoing first aspect embodiments of the present disclosure or the method for recall of goods provided by the second aspect embodiments.

Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific ASIC (application specific integrated circuit), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs, the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

These computing programs (also referred to as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service ("Virtual Private Server" or simply "VPS") are overcome.

It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed aspects are achieved, and are not limited herein.

In the description of this specification, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present disclosure, the meaning of "a plurality" is at least two, such as two, three, etc., unless explicitly specified otherwise.

Although embodiments of the present disclosure have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the present disclosure, and that variations, modifications, alternatives, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the present disclosure.

Claims

1. A method of training a twin tower model, the method comprising:

acquiring a plurality of positive and negative sample pairs, wherein each positive and negative sample pair comprises a positive sample and a negative sample corresponding to the same sample query word, the positive sample further comprises first commodity information, and the negative sample further comprises second commodity information;

inputting the positive and negative sample pairs into a double-tower model for training, and obtaining a first super-parameter corresponding to the positive sample and a second super-parameter corresponding to the negative sample, wherein the first super-parameter and the second super-parameter are used for obtaining an objective function for optimizing model parameters of the double-tower model;

acquiring an objective function of the model parameters for optimizing the double-tower model according to the positive sample and the first super parameter, and the negative sample and the second super parameter;

and optimizing the model parameters of the double-tower model according to the objective function, and returning to continue training until the training ending condition is met to obtain a target double-tower model, wherein the target double-tower model is used for determining the mapping relation between the query word and commodity information.

2. The method of claim 1, wherein the second hyper-parameter acquisition process comprises:

inputting the positive and negative sample pairs into the double-tower model, respectively acquiring sample word vectors of the sample query words by a user tower in the double-tower model, and acquiring first sample vectors of the first commodity information and second sample vectors of the second commodity information by a commodity tower in the double-tower model;

and acquiring a second super parameter corresponding to the negative sample based on the sample word vector, the first sample vector and the second sample vector.

3. The method according to claim 2, wherein the obtaining the second hyper-parameter corresponding to the negative sample based on the sample word vector, the first sample vector, and the second sample vector comprises:

acquiring a first similarity between the first sample vector and the second sample vector;

and acquiring a second learning parameter and a third learning parameter, and acquiring a second super parameter corresponding to the negative sample according to the second learning parameter, the third learning parameter and the first similarity.

4. The method according to claim 1 or 2, wherein the first hyper-parameter acquisition process comprises:

And inputting the positive and negative sample pairs into a double-tower model for training, and acquiring a preset first learning parameter as the first super parameter in the training process.

5. A method according to claim 3, wherein said obtaining an objective function for optimizing model parameters of the twin tower model based on the positive sample and the first super parameter, and the negative sample and the second super parameter, comprises:

obtaining a second similarity between the sample word vector and the first sample vector, and obtaining a third similarity between the sample word vector and the second sample vector;

acquiring a first objective function and a second objective function of the double-tower model according to the first similarity, the second similarity, the third similarity, the first super-parameter and the second super-parameter;

and acquiring an objective function of the double-tower model according to the first objective function and the second objective function.

6. The method of claim 5, wherein the process of obtaining the first objective function comprises:

and acquiring the first objective function according to the second similarity, the first super-parameter, the third similarity and the second super-parameter.

7. The method of claim 5, wherein the second objective function obtaining process includes:

and acquiring the second objective function according to the first similarity, the first super-parameter, the second similarity and the second super-parameter.

8. The method of claim 5, wherein the process of obtaining the first objective function comprises:

and acquiring the first objective function according to the second similarity, the third similarity and the second super parameter.

9. The method of claim 5, wherein the process of obtaining the second objective function comprises:

and acquiring the second objective function according to the first similarity, the first super-parameter and the second similarity.

10. A method of recall of an article, the method comprising:

acquiring a user query word, inputting the user query word into a target double-tower model, and determining a first target vector corresponding to the user query word based on the user query word through the target double-tower model;

determining a second target vector corresponding to the candidate commodity through the target double-tower model, and establishing an index set according to the second target vector;

Searching candidate indexes in the index set according to the first target vector to determine target commodities associated with the user query word, wherein the target double-tower model is a model trained by the method according to any one of claims 1-9.

11. The method of claim 10, wherein the retrieving candidate indices of the index set from the first target vector to determine target items associated with the user query term comprises:

obtaining the similarity between the first target vector and the candidate index;

and determining a target index from the candidate indexes according to the similarity, and taking the candidate commodity associated with the target index as a target commodity.

12. The method of claim 11, wherein said determining a target index from candidate indexes based on said similarity comprises:

and if the similarity between the first target vector and the candidate index is greater than the similarity threshold, determining the candidate index as the target index.

13. The method of claim 11, wherein said determining a target index from candidate indexes based on said similarity comprises:

And sequencing the candidate indexes according to the similarity descending order, and determining N candidate indexes before sequencing as the target indexes, wherein N is a positive integer.

14. A training apparatus for a twin tower model, the apparatus comprising:

the first acquisition module is used for acquiring a plurality of positive and negative sample pairs, wherein each positive and negative sample pair comprises a positive sample and a negative sample corresponding to the same sample query word, the positive sample further comprises first commodity information, and the negative sample further comprises second commodity information;

the second acquisition module is used for inputting the positive and negative sample pairs into a double-tower model for training, and acquiring a first super-parameter corresponding to the positive sample and a second super-parameter corresponding to the negative sample, wherein the first super-parameter and the second super-parameter are used for acquiring an objective function for optimizing model parameters of the double-tower model;

the third acquisition module is used for acquiring an objective function for optimizing model parameters of the double-tower model according to the positive sample, the first super-parameter, the negative sample and the second super-parameter;

and the training module is used for optimizing the model parameters of the double-tower model according to the objective function and returning to continue training until the training ending condition is met to obtain a target double-tower model, wherein the target double-tower model is used for determining the mapping relation between the query word and the commodity information.

15. A merchandise recall apparatus, the apparatus comprising:

the acquisition module is used for acquiring a user query word, inputting the user query word into a target double-tower model, and determining a first target vector corresponding to the user query word based on the user query word through the target double-tower model;

the first determining module is used for determining a second target vector corresponding to the candidate commodity through the target double-tower model, and establishing an index set according to the second target vector;

and the second determining module is used for searching candidate indexes in the index set according to the first target vector to determine target commodities associated with the user query word, wherein the target double-tower model is a model trained by the method according to any one of claims 1-9.

16. An electronic device comprising a memory and a processor;

wherein the memory and the processor are in communication with each other via an internal connection, the memory being for storing instructions, the processor being for executing the memory-stored instructions and, when the processor executes the memory-stored instructions, causing the processor to perform the method of any one of claims 1-9 or 10-13.

17. Computer readable storage medium a computer readable storage medium storing a computer program which, when run on a computer, performs the method of any one of claims 1-9 or claims 10-13.

18. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-9 or claims 10-13.