CN113868523A - Recommendation model training method, electronic device and storage medium

Info

Publication number: CN113868523A
Application number: CN202111132414.6A
Authority: CN
Inventors: 王国瑞
Original assignee: Tencent Music Entertainment Technology Shenzhen Co Ltd
Current assignee: Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority date: 2021-09-26
Filing date: 2021-09-26
Publication date: 2021-12-31

Abstract

The embodiments of the application disclose a recommendation model training method, an electronic device and a storage medium, applied to the technical field of machine learning. The method comprises: obtaining a sample feature set and inputting it into an initial prediction model to obtain a predicted click rate and a predicted conversion rate of a sample user for a sample recommendation object; generating a click loss function based on a click label and the predicted click rate; generating a click conversion loss function based on the click label, a conversion label, the predicted click rate and the predicted conversion rate; obtaining a weight loss function according to the click loss function, the click conversion loss function and model parameters; generating a click weight and a click conversion weight based on the weight loss function; obtaining a target loss function based on the click weight, the click conversion weight, the click loss function and the click conversion loss function; and correcting the model parameters based on the target loss function to obtain a target prediction model. The embodiments of the application can improve the prediction accuracy of the target prediction model for the click rate and the conversion rate.

Description

Recommendation model training method, electronic device and storage medium

Technical Field

The present application relates to the field of machine learning technologies, and in particular, to a recommendation model training method, an electronic device, and a storage medium.

Background

At present, a recommendation scene requires predicting how interested a user is in a recommended object, so that objects the user is interested in can be pushed to the user terminal. Prediction accuracy is directly reflected in the user's click rate and conversion rate for the recommended object, so accurate pushing can be achieved by predicting the click rate and the conversion rate. Existing prediction methods generally model the click rate prediction task and the conversion rate prediction task jointly, and train with a combination of the loss functions of the two tasks to obtain a target prediction model that predicts both the click rate and the conversion rate. However, when the two loss functions are combined, their weight parameters are difficult to tune, which may degrade the training effect of the model and lead to low prediction accuracy. How to improve the prediction accuracy of the click rate and the conversion rate is therefore a problem to be solved urgently.

Disclosure of Invention

The embodiment of the application provides a recommendation model training method, electronic equipment and a storage medium, which can improve the prediction accuracy of a target prediction model for click rate and conversion rate.

In one aspect, an embodiment of the present application provides a recommendation model training method, including:

acquiring a sample feature set; the sample feature set comprises user attribute information of a sample user, object attribute information of a sample recommended object, a click label and a conversion label of the sample user for the sample recommended object;

inputting the sample feature set into an initial prediction model, and acquiring the predicted click rate and the predicted conversion rate of a sample user for a sample recommendation object based on the initial prediction model;

generating a click loss function based on the click label and the predicted click rate, and generating a click conversion loss function based on the conversion label, the predicted click rate and the predicted conversion rate;

obtaining a weight loss function according to the click loss function, the click conversion loss function and model parameters of the initial prediction model, and generating a click weight corresponding to the click loss function and a click conversion weight corresponding to the click conversion loss function based on the weight loss function;

and obtaining a target loss function based on the click weight, the click conversion weight, the click loss function and the click conversion loss function, and correcting the model parameters of the initial prediction model based on the target loss function to obtain a target prediction model.

In one aspect, an embodiment of the present application provides a data recommendation method based on a recommendation model, where the method includes:

acquiring target user attribute information of a predicted user and target object attribute information of an object to be recommended;

inputting the target user attribute information and the target object attribute information into a target prediction model;

generating a target prediction click rate and a target prediction conversion rate of a prediction user for an object to be recommended in a target prediction model;

acquiring interest scores of the prediction users for the objects to be recommended based on the target prediction click rate and the target prediction conversion rate;

and if the interest score is larger than the interest score threshold, pushing the object to be recommended to a user terminal corresponding to the predicted user.
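The push decision in the last two steps can be sketched as follows. This is a minimal illustration, not the method of the application: the text only says the interest score is obtained "based on" the two predictions, so combining them as a product is an assumption, and the names `interest_score` and `should_push` are hypothetical.

```python
def interest_score(pred_ctr: float, pred_cvr: float) -> float:
    """Combine the two predictions into a single score.

    Assumption: the product pCTR * pCVR is used purely for illustration;
    the text does not state how the two rates are combined.
    """
    return pred_ctr * pred_cvr


def should_push(pred_ctr: float, pred_cvr: float, threshold: float) -> bool:
    """Push the object only when the score is strictly above the threshold."""
    return interest_score(pred_ctr, pred_cvr) > threshold


print(should_push(0.5, 0.4, threshold=0.1))  # True  (0.2 > 0.1)
print(should_push(0.5, 0.1, threshold=0.1))  # False (0.05 is not > 0.1)
```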

In one aspect, an embodiment of the present application provides a recommendation model training device, where the device includes:

the acquisition module is used for acquiring a sample feature set; the sample feature set comprises user attribute information of a sample user, object attribute information of a sample recommended object, a click label and a conversion label of the sample user for the sample recommended object;

the acquisition module is further used for inputting the sample feature set into an initial prediction model, and obtaining a predicted click rate and a predicted conversion rate of a sample user for a sample recommendation object based on the initial prediction model;

the generating module is used for generating a click loss function based on the click label and the predicted click rate and generating a click conversion loss function based on the conversion label, the predicted click rate and the predicted conversion rate;

the generating module is further used for obtaining a weight loss function according to the click loss function, the click conversion loss function and the model parameters of the initial prediction model, and generating a click weight corresponding to the click loss function and a click conversion weight corresponding to the click conversion loss function based on the weight loss function;

and the correction module is used for obtaining a target loss function based on the click weight, the click conversion weight, the click loss function and the click conversion loss function, and correcting the model parameters of the initial prediction model based on the target loss function to obtain a target prediction model.

In one aspect, an embodiment of the present application provides a data recommendation device based on a recommendation model, where the device includes:

the acquisition module is used for acquiring target user attribute information of a predicted user and target object attribute information of an object to be recommended;

the input module is used for inputting the target user attribute information and the target object attribute information into the target prediction model;

the generating module is used for generating a target prediction click rate and a target prediction conversion rate of a prediction user aiming at an object to be recommended in a target prediction model;

the acquisition module is further used for obtaining the interest scores of the prediction users for the objects to be recommended based on the target prediction click rate and the target prediction conversion rate;

and the pushing module is used for pushing the object to be recommended to the user terminal corresponding to the predicted user if the interest score is larger than the interest score threshold.

In one aspect, an embodiment of the present application provides an electronic device, which includes a processor and a memory, where the memory is used to store a computer program, and the computer program includes program instructions, and the processor is configured to call the program instructions to perform some or all of the steps in the above method.

In one aspect, the present application provides a computer-readable storage medium, which stores a computer program, where the computer program includes program instructions, and the program instructions, when executed by a processor, are used to perform some or all of the steps of the above method.

Accordingly, according to an aspect of the present application, there is provided a computer program product or computer program comprising program instructions stored in a computer readable storage medium. The processor of the computer device reads the program instructions from the computer-readable storage medium, and the processor executes the program instructions to cause the computer device to execute the recommendation model training method and/or the recommendation model-based data recommendation method provided above.

In the embodiments of the application, a sample feature set can be obtained and input into an initial prediction model, and the predicted click rate and the predicted conversion rate of a sample user for a sample recommendation object are obtained based on the initial prediction model. A click loss function is generated based on the click label and the predicted click rate, and a click conversion loss function is generated based on the conversion label, the predicted click rate and the predicted conversion rate. A weight loss function is obtained according to the click loss function, the click conversion loss function and the model parameters of the initial prediction model, and a click weight corresponding to the click loss function and a click conversion weight corresponding to the click conversion loss function are generated based on the weight loss function. A target loss function is then obtained based on the click weight, the click conversion weight, the click loss function and the click conversion loss function, and the model parameters of the initial prediction model are corrected based on the target loss function to obtain a target prediction model. With this method, the click weight and the click conversion weight can be dynamically adjusted during model training to keep the two loss functions balanced within the target loss function, so that the target prediction model trains better and predicts the click rate and the conversion rate more accurately.

Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of an application architecture according to an embodiment of the present application;

Fig. 2 is a schematic flowchart of a recommendation model training method according to an embodiment of the present application;

Fig. 3 is a schematic flowchart of a recommendation model training method according to an embodiment of the present application;

Fig. 4 is a scene schematic diagram for obtaining attribute features according to an embodiment of the present application;

Fig. 5 is a scene schematic diagram for acquiring a fusion attribute feature according to an embodiment of the present application;

Fig. 6 is a schematic view of a scenario of model training provided in an embodiment of the present application;

Fig. 7 is a schematic flowchart of a data recommendation method based on a recommendation model according to an embodiment of the present application;

Fig. 8a is a schematic diagram of a push scenario for a predicted user according to an embodiment of the present application;

Fig. 8b is a schematic diagram of a push scenario for a predicted user according to an embodiment of the present application;

Fig. 9 is a schematic structural diagram of a recommendation model training apparatus according to an embodiment of the present application;

Fig. 10 is a schematic structural diagram of a data recommendation device based on a recommendation model according to an embodiment of the present application;

Fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.

The recommendation model training method provided by the embodiment of the application can be realized in electronic equipment, and the electronic equipment can be a server or terminal equipment. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server. The terminal device may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, and the like.

In some embodiments, please refer to Fig. 1, which is a schematic diagram of an application architecture provided in the present application; the recommendation model training method provided in the present application can be executed through this application architecture. As shown in Fig. 1, the architecture may include an electronic device in which an initial prediction model is deployed.

The initial prediction model may include a sharing layer, a prediction layer and a weight adjusting layer. Specifically: (1) the sharing layer may include a feature processing layer and a prediction feature generation layer. The feature processing layer may be configured to process the sample feature set input by the electronic device to obtain a fusion attribute feature. The prediction feature generation layer may include a click weight prediction network, a conversion weight prediction network and N prediction feature generation networks, which the electronic device may use to obtain a click prediction feature and a conversion prediction feature based on the fusion attribute feature;

(2) in the prediction layer, the electronic device may use the click prediction network to obtain a predicted click rate based on the click prediction feature, and use the conversion prediction network to obtain a predicted conversion rate based on the conversion prediction feature. It may then generate a click loss function based on the predicted click rate and the click label in the sample feature set, and generate a click conversion loss function based on the predicted conversion rate and the conversion label in the sample feature set;

(3) the weight adjusting layer may be configured, during training of the initial prediction model, to obtain a weight loss function from the click loss function, the click conversion loss function and the model parameters of the initial prediction model, and to generate, based on the weight loss function, a click weight corresponding to the click loss function and a click conversion weight corresponding to the click conversion loss function. That is, the initial click weight (the click weight used in the previous round of model training, e.g., the i-th round) and the initial click conversion weight (the click conversion weight used in the previous round of model training) are dynamically adjusted based on the weight loss function to obtain the click weight and the click conversion weight. The click loss function is weighted by the click weight to obtain a weighted click loss function, the click conversion loss function is weighted by the click conversion weight to obtain a weighted click conversion loss function, and the two are summed to obtain the target loss function of the current round of model training (e.g., the (i+1)-th round). The model parameters of the initial prediction model are then corrected based on the target loss function to obtain the target prediction model through training.

Subsequently, the electronic device may use the target prediction model to execute the data recommendation method based on the recommendation model, that is, perform a prediction task in a recommendation scene and then perform the recommendation task based on the prediction result. It can be understood that the target prediction model used at application time includes the sharing layer and the prediction layer of the initial prediction model. In addition, the click rate is also called CTR (Click-Through Rate), and the conversion rate is also called CVR (Conversion Rate).
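The weighted summation described in item (3) can be sketched as below. The sketch only covers the final combination step. How the weight adjusting layer derives the two weights from the weight loss function is not shown here, so the weights are plain inputs, and the function name `target_loss` is hypothetical.

```python
def target_loss(click_loss: float,
                click_conv_loss: float,
                click_weight: float,
                click_conv_weight: float) -> float:
    """Weighted sum of the two task losses for one training round."""
    weighted_click = click_weight * click_loss                 # weighted click loss
    weighted_click_conv = click_conv_weight * click_conv_loss  # weighted click conversion loss
    return weighted_click + weighted_click_conv


# Illustrative values only: losses from round i+1, weights adjusted after round i.
loss = target_loss(click_loss=0.6, click_conv_loss=0.2,
                   click_weight=1.5, click_conv_weight=0.5)
```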

It should be understood that fig. 1 merely illustrates a possible application architecture of the present application, and does not limit the specific architecture of the present application, that is, the present application may also provide other forms of application architectures.

Optionally, in some embodiments, the electronic device may execute the recommendation model training method according to actual service requirements to improve the prediction accuracy of the click rate and the conversion rate. The technical solution of the application can be applied to a prediction task in any recommendation scene. That is, the electronic device can train the initial prediction model, in the model training manner of this technical solution, using information related to the sample user and the sample recommendation object (such as the user attribute information of the sample user, the object attribute information of the sample recommendation object, and the click label and conversion label of the sample user for the sample recommendation object). It can subsequently execute the data recommendation method based on the recommendation model: obtain the target user attribute information of a prediction user and the target object attribute information of an object to be recommended, and apply the trained target prediction model to this information to obtain the target predicted click rate and the target predicted conversion rate of the prediction user for the object to be recommended. The interest score of the prediction user for the object to be recommended can then be determined by combining the target predicted click rate and the target predicted conversion rate, and accurate pushing can be realized based on the interest score.

For example, the method can be applied to a recommendation scene of a live broadcast product. Here the sample recommendation object (object to be recommended) may be an online anchor, and the sample user (prediction user) may be a user who clicks the online anchor on a live broadcast list interface. The electronic device can train the model using the related information of the sample user and the sample recommendation object, subsequently determine the interest score of the prediction user for the online anchor from the prediction result of the target prediction model, and push the online anchor to the user terminal of the prediction user based on the interest score. For another example, the method may be applied to a recommendation scene of a news product. Here the sample recommendation object (object to be recommended) may be a published news article, and the sample user (prediction user) may be a user who clicks to read the article on a news list interface. The electronic device can train the model in the same way, determine the interest score of the prediction user for the news article from the prediction result of the target prediction model, and push the news article to the user terminal of the prediction user based on the interest score. For another example, the method can be applied to a recommendation scene of e-commerce products. Here the sample recommendation object (object to be recommended) may be a commodity for purchase, and the sample user (prediction user) may be a user who clicks the commodity on a commodity list interface. The interest score of the prediction user for the commodity is determined from the prediction result of the target prediction model, and the commodity is pushed to the user terminal of the prediction user based on the interest score.

Optionally, data related to the present application, such as the predicted click rate and the predicted conversion rate, may be stored in a database, or may be stored in a blockchain, for example through a blockchain distributed system, which is not limited in the present application.

It is to be understood that the foregoing scenarios are only examples, and do not constitute a limitation on application scenarios of the technical solutions provided in the embodiments of the present application, and the technical solutions of the present application may also be applied to other scenarios. For example, as can be known by those skilled in the art, with the evolution of system architecture and the emergence of new service scenarios, the technical solution provided in the embodiments of the present application is also applicable to similar technical problems.

Based on the above description, the present application provides a recommendation model training method, which may be performed by the above-mentioned electronic device. Referring to Fig. 2, Fig. 2 is a schematic flowchart of a recommendation model training method according to an embodiment of the present application. As shown in Fig. 2, the flow of the recommendation model training method according to the embodiment of the present application may include the following steps:

S201, obtaining a sample feature set.

The sample feature set comprises user attribute information of a sample user, object attribute information of a sample recommendation object, a click label of the sample user for the sample recommendation object and a conversion label.

Optionally, the user attribute information of the sample user may include one or more attributes that characterize the user, and the object attribute information of the sample recommendation object may include one or more attributes that characterize the recommendation object. Both may be set by the relevant service personnel according to the actual application scenario, which is not limited herein. For example, in a recommendation scene of a live broadcast product, the user attribute information of the sample user may include basic portrait attributes (age, gender, education level, and the like) and statistical attributes (the sample user's live viewing duration within one day or three days, the number of flowers the sample user sent to anchors within three days or one week, the number of rewards the sample user gave to anchors within three days or one week, and the like). The object attribute information of the online anchor, i.e., the sample recommendation object, may include basic portrait attributes (age, gender, education level, and the like) and real-time attributes (the total or maximum number of viewing users in the live broadcast room within one day or three days, the type of the live broadcast room, the broadcast duration of the live broadcast room within one day, and the like).

Optionally, the click label and the conversion label of the sample user for the sample recommendation object are determined by whether the sample user clicked the sample recommendation object and whether a conversion behavior occurred. If the sample user clicked the sample recommendation object but had no conversion behavior, the click label is 1 and the conversion label is 0; if the sample user clicked the sample recommendation object and had a conversion behavior, the click label is 1 and the conversion label is 1; and if the sample user did not click the sample recommendation object, the click label is 0 and the conversion label is 0. For example, in a recommendation scene of a live broadcast product, the click behavior means that the user clicks into the live broadcast room of an online anchor, and the conversion behavior means that the user watches in the live broadcast room of the online anchor for a certain time (for example, 30 s).
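The labelling rule above can be written as a small helper. This is an illustrative sketch; the function name `make_labels` and its boolean inputs are hypothetical, and treating a conversion without a click as (0, 0) follows the rule that a sample with no click has conversion label 0.

```python
from typing import Tuple


def make_labels(clicked: bool, converted: bool) -> Tuple[int, int]:
    """Return (click_label, conversion_label) for one user/object sample."""
    click_label = 1 if clicked else 0
    # A conversion only counts when it follows a click.
    conversion_label = 1 if (clicked and converted) else 0
    return click_label, conversion_label


print(make_labels(True, False))   # (1, 0): clicked, no conversion
print(make_labels(True, True))    # (1, 1): clicked and converted
print(make_labels(False, False))  # (0, 0): not clicked
```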

S202, inputting the sample feature set into an initial prediction model, and obtaining the predicted click rate and the predicted conversion rate of the sample user for the sample recommendation object based on the initial prediction model.

In one possible implementation, the electronic device may input the sample feature set into an initial prediction model, generate, in the initial prediction model, a user attribute feature corresponding to the user attribute information using user attribute information of sample users in the sample feature set, and generate an object attribute feature corresponding to the object attribute information using object attribute information of sample recommended objects in the sample feature set. The user attribute feature and the object attribute feature may be generated by using a feature processing layer in a sharing layer of the initial prediction model.

Optionally, taking generation of the user attribute feature as an example, the electronic device may use the feature processing layer in the sharing layer of the initial prediction model to perform one-hot encoding on the user attribute information, obtaining the user attribute feature corresponding to the user attribute information. For example, suppose the user attribute information includes an age attribute divided into the classes [<18, 19-30, 31-40, 41-50, 51-60, >60]. If the age of the sample user is 24, the attribute feature obtained by one-hot encoding the age attribute may be represented as [0,1,0,0,0,0]. Each of the one or more attributes in the user attribute information thus yields a corresponding attribute feature, and the user attribute feature may be a feature matrix composed of these attribute features. Optionally, the object attribute feature may be generated in the same way as the user attribute feature, and details are not repeated here.
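The age example can be reproduced with a short sketch. The bucket boundaries follow the classes listed above. Placing age 18 in the first bucket is an assumption, since the listed classes leave that value unassigned, and the names `AGE_UPPER_BOUNDS` and `one_hot_age` are hypothetical.

```python
import bisect
from typing import List

# Upper bounds of the first five classes; anything above 60 falls in the last class.
# Assumption: age 18 is placed in the first class.
AGE_UPPER_BOUNDS = [18, 30, 40, 50, 60]


def one_hot_age(age: int) -> List[int]:
    """One-hot encode an age into six classes."""
    index = bisect.bisect_left(AGE_UPPER_BOUNDS, age)  # class index 0..5
    vector = [0] * (len(AGE_UPPER_BOUNDS) + 1)
    vector[index] = 1
    return vector


print(one_hot_age(24))  # [0, 1, 0, 0, 0, 0]
```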

In some embodiments, to obtain the predicted click rate and the predicted conversion rate of the sample user for the sample recommendation object based on the initial prediction model, the electronic device may use the feature processing layer in the sharing layer of the initial prediction model to obtain a fusion attribute feature of the user attribute feature and the object attribute feature, and then obtain the predicted click rate and the predicted conversion rate based on the fusion attribute feature. Specifically, the feature processing layer splices the user attribute feature and the object attribute feature to obtain the fusion attribute feature, which may be a feature vector obtained by splicing the feature matrix of the user attribute feature with the feature matrix of the object attribute feature. For example, if the user attribute feature is represented as [0,1,0,0,0,0], [0,0,0,1,0,0] and the object attribute feature is represented as [0,0,0,1,0,0], [0,1,0,0,0,0], splicing them in order gives the fusion attribute feature [0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0].
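The splicing step can be illustrated as a simple concatenation. This sketch assumes the rows of the user feature matrix come first, followed by the rows of the object feature matrix, each flattened in order; the function name `fuse_features` is hypothetical.

```python
from itertools import chain
from typing import List


def fuse_features(user_feats: List[List[int]],
                  object_feats: List[List[int]]) -> List[int]:
    """Flatten user rows, then object rows, into one fused feature vector."""
    return list(chain.from_iterable(user_feats + object_feats))


user = [[0, 1, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0]]
obj = [[0, 0, 0, 1, 0, 0], [0, 1, 0, 0, 0, 0]]
fused = fuse_features(user, obj)
print(len(fused))  # 24
```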

In one possible implementation, to obtain the predicted click rate and the predicted conversion rate of the sample user for the sample recommendation object based on the fusion attribute feature, the electronic device may use the prediction feature generation layer in the sharing layer of the initial prediction model to obtain a click prediction feature and a conversion prediction feature from the fusion attribute feature, and then, in the prediction layer of the initial prediction model, obtain the predicted click rate based on the click prediction feature and the predicted conversion rate based on the conversion prediction feature.

Optionally, the prediction feature generation layer in the sharing layer of the initial prediction model may include a click weight prediction network, a conversion weight prediction network, and N prediction feature generation networks. The electronic device may input the fusion attribute feature into the prediction feature generation layer, generate the click prediction feature using the click weight prediction network and the N prediction feature generation networks, and generate the conversion prediction feature using the conversion weight prediction network and the N prediction feature generation networks. For the specific process of generating the click prediction feature and the conversion prediction feature, see the related description of step S302 below.
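One plausible reading of this structure is a gated mixture: each weight prediction network outputs one weight per feature generation network, and the task feature is the weighted sum of the N network outputs. The sketch below illustrates that reading only. The actual computation is described in step S302, which is not shown here, so the linear-plus-ReLU networks, the softmax normalization and every name in the code are assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def matvec(weights: Matrix, x: List[float]) -> List[float]:
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_mix(x: List[float], feature_nets: List[Matrix], weight_net: Matrix) -> List[float]:
    """Mix the outputs of N feature networks with task-specific weights.

    Assumptions: each feature network is a single linear layer with ReLU,
    and the weight network is a linear layer followed by softmax.
    """
    outputs = [[max(0.0, v) for v in matvec(net, x)] for net in feature_nets]
    gate = softmax(matvec(weight_net, x))  # one weight per feature network
    dim = len(outputs[0])
    return [sum(g * out[d] for g, out in zip(gate, outputs)) for d in range(dim)]


# Two feature networks shared by both tasks; each task has its own weight network.
fused = [1.0, 0.0]
feature_nets = [[[1.0, 0.0]], [[3.0, 0.0]]]
click_feature = gated_mix(fused, feature_nets, [[0.0, 0.0], [0.0, 0.0]])
conversion_feature = gated_mix(fused, feature_nets, [[0.0, 0.0], [50.0, 0.0]])
```

In this toy setup the click task weights both feature networks equally, while the conversion task relies almost entirely on the second one, which shows how the two tasks can share the same N networks yet receive different features.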

The prediction layer in the initial prediction model may include a click prediction network constructed for predicting the click rate and a conversion prediction network constructed for predicting the conversion rate. To obtain the predicted click rate based on the click prediction feature and the predicted conversion rate based on the conversion prediction feature, the electronic device may input the click prediction feature into the click prediction network in the prediction layer to obtain the predicted click rate, and input the conversion prediction feature into the conversion prediction network in the prediction layer to obtain the predicted conversion rate. For the specific process of obtaining the predicted click rate and the predicted conversion rate, see the related description of step S302 below.

S203, generating a click loss function based on the click label and the predicted click rate, and generating a click conversion loss function based on the click label, the conversion label, the predicted click rate and the predicted conversion rate.

In one possible implementation, the electronic device may use a plurality of sample feature sets as a sample data set, determine the click loss function based on the predicted click rate obtained from each sample feature set in the sample data set, and determine the click conversion loss function based on the predicted click rate and the predicted conversion rate obtained from each sample feature set in the sample data set. The electronic device may generate the click loss function for the initial prediction model based on the click labels and the predicted click rates; the click loss function L_ctr(t) may be:

where N is the number of sample feature sets in the sample data set; x_i^(ctr) denotes the click prediction feature obtained from the i-th sample feature set in the sample data set; y_i denotes the click label of the i-th sample feature set; θ_ctr denotes the network parameters of the click prediction network; l denotes the cross-entropy loss function; and f_1(v_1, u_1) denotes the predicted click rate generated from the inputs u_1 and v_1. Here t denotes the t-th training pass over the initial prediction model, i.e., t may be either the training iteration index or the training time measured relative to the 0-th training pass (whose time is taken as 0); the following description takes t as the training iteration index.
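The formula for L_ctr(t) is not reproduced in this text. One form consistent with the symbol definitions above, assuming a plain sum over the N samples as in the published ESMM formulation (an average with a 1/N factor would fit the definitions equally well), is:

```latex
L_{ctr}(t) = \sum_{i=1}^{N} l\left( y_i,\; f_1\left( x_i^{(ctr)},\, \theta_{ctr} \right) \right)
```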

In addition, the electronic device may generate a click conversion loss function for the initial prediction model based on the click label, the conversion label, the predicted click rate, and the predicted conversion rate; the click conversion loss function L_ctcvr(t) may be:

where N is the number of sample feature sets in the sample data set; x_i^(ctr) denotes the click prediction feature obtained from the i-th sample feature set in the sample data set; x_i^(cvr) denotes the conversion prediction feature obtained from the i-th sample feature set in the sample data set; y_i denotes the click label of the i-th sample feature set; z_i denotes the conversion label of the i-th sample feature set; θ_ctr denotes the network parameters of the click prediction network; θ_cvr denotes the network parameters of the conversion prediction network; l denotes the cross-entropy loss function; f_1(v_1, u_1) denotes the predicted click rate generated from the inputs u_1 and v_1; and f_2(v_2, u_2) denotes the predicted conversion rate generated from the inputs u_2 and v_2. As above, t denotes the t-th training pass over the initial prediction model, i.e., either the training iteration index or the training time measured relative to the 0-th training pass (whose time is taken as 0).
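The formula for L_ctcvr(t) is likewise not reproduced in this text. A form consistent with the definitions above, assuming (as in the published ESMM formulation) that the label is the product of the click label and the conversion label and that the prediction is the product of the predicted click rate and the predicted conversion rate, is:

```latex
L_{ctcvr}(t) = \sum_{i=1}^{N} l\left( y_i \cdot z_i,\; f_1\left( x_i^{(ctr)},\, \theta_{ctr} \right) \cdot f_2\left( x_i^{(cvr)},\, \theta_{cvr} \right) \right)
```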

In addition, the training target is a multi-task target: a click rate prediction task and a conversion rate prediction task are trained together. Among the labels of sample users for sample recommendation objects, click labels equal to 1 are relatively numerous while conversion labels equal to 1 are relatively few, so the click sample space and the conversion sample space differ, which in turn can degrade the training effect.

Optionally, the network parameters of the click prediction network and the network parameters of the conversion prediction network may be model parameters in the initial prediction model.
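The two loss functions of step S203 can be illustrated with a minimal sketch. It assumes binary cross-entropy summed over samples and an ESMM-style click conversion term (label y_i * z_i, prediction pCTR_i * pCVR_i); the function names and the toy batch are hypothetical and are not taken from the application.

```python
import math

def cross_entropy(label, prob, eps=1e-12):
    """Binary cross-entropy l(label, prob) for one sample."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def click_loss(click_labels, p_ctr):
    """L_ctr: cross-entropy between click labels and predicted click rates."""
    return sum(cross_entropy(y, p) for y, p in zip(click_labels, p_ctr))

def click_conversion_loss(click_labels, conv_labels, p_ctr, p_cvr):
    """L_ctcvr: cross-entropy between y*z and pCTR*pCVR (assumed form)."""
    return sum(
        cross_entropy(y * z, pc * pv)
        for y, z, pc, pv in zip(click_labels, conv_labels, p_ctr, p_cvr)
    )

# Made-up toy batch of four samples.
y = [1, 1, 0, 0]                 # click labels
z = [1, 0, 0, 0]                 # conversion labels
p_ctr = [0.8, 0.6, 0.3, 0.1]     # predicted click rates
p_cvr = [0.5, 0.2, 0.4, 0.1]     # predicted conversion rates

l_ctr = click_loss(y, p_ctr)
l_ctcvr = click_conversion_loss(y, z, p_ctr, p_cvr)
print(f"L_ctr = {l_ctr:.4f}, L_ctcvr = {l_ctcvr:.4f}")
```

Because the click conversion term is defined over every sample rather than only over clicked ones, both losses are computed on the same sample space.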

S204, obtaining a weight loss function according to the click loss function, the click conversion loss function and the model parameters of the initial prediction model, and generating a click weight corresponding to the click loss function and a click conversion weight corresponding to the click conversion loss function based on the weight loss function.

In a possible embodiment, to obtain the weight loss function according to the click loss function, the click conversion loss function, and the model parameters of the initial prediction model, the electronic device may: obtain a first weight function corresponding to the click loss function and a second weight function corresponding to the click conversion loss function; determine a click gradient function according to the first weight function, the click loss function, and the model parameters of the initial prediction model; determine a click conversion gradient function according to the second weight function, the click conversion loss function, and the model parameters of the initial prediction model; and generate the weight loss function according to the click gradient function and the click conversion gradient function.

The first weight function may be used to determine the click weight, and the second weight function may be used to determine the click conversion weight. The electronic device may determine a model loss function for training the initial prediction model according to the first weight function, the click loss function, the second weight function, and the click conversion loss function. The first weight function and the second weight function may be set by relevant service personnel based on empirical values.

For example, let the first weight function be W_ctr(t) and the second weight function be W_cvr(t); the model loss function L_task(t) thus determined can be expressed as:

L_task(t) = W_ctr(t) L_ctr(t) + W_cvr(t) L_ctcvr(t)

where t denotes the t-th training pass over the initial prediction model, L_ctr(t) denotes the click loss function, and L_ctcvr(t) denotes the click conversion loss function.

In one possible embodiment, to generate the click weight and the click conversion weight based on the weight loss function, the electronic device may use the weight loss function in the weight adjustment layer of the initial prediction model to obtain a first weight adjustment function corresponding to the first weight function and a second weight adjustment function corresponding to the second weight function, obtain the click weight from the first weight function and the first weight adjustment function, and obtain the click conversion weight from the second weight function and the second weight adjustment function.

S205, obtaining a target loss function based on the click weight, the click conversion weight, the click loss function and the click conversion loss function, and correcting model parameters of the initial prediction model based on the target loss function to obtain a target prediction model.

In one possible embodiment, in one round of training for the initial prediction model, let the click weight be W_ctr, the click conversion weight be W_ctcvr, the click loss function be L_ctr, and the click conversion loss function be L_ctcvr; the target loss function L_task thus obtained is:

L_task = W_ctr L_ctr + W_ctcvr L_ctcvr

It can be understood that, once the click weight has been obtained from the first weight function and the click conversion weight from the second weight function, the electronic device may substitute the click weight for the first weight function and the click conversion weight for the second weight function in the model loss function; the resulting model loss function is the target loss function.
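As a small worked illustration of the target loss function, the sketch below combines two loss values under two weight settings. The numbers are invented for illustration only; in the method described here the weights come from the weight adjustment layer rather than being chosen by hand.

```python
def target_loss(w_ctr, w_ctcvr, l_ctr, l_ctcvr):
    """L_task = W_ctr * L_ctr + W_ctcvr * L_ctcvr for one training round."""
    return w_ctr * l_ctr + w_ctcvr * l_ctcvr

# Made-up loss values in which the click loss dominates.
l_ctr, l_ctcvr = 2.0, 0.5

equal = target_loss(1.0, 1.0, l_ctr, l_ctcvr)        # both losses weighted equally
rebalanced = target_loss(0.5, 1.5, l_ctr, l_ctcvr)   # click loss down-weighted
print(equal, rebalanced)
```

With equal weights the click loss contributes 2.0 of the 2.5 total; after reweighting, its contribution (1.0) is much closer to that of the click conversion loss (0.75).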

The electronic device may therefore correct the model parameters of the initial prediction model based on the target loss function; the model obtained in this round participates in the next round of training, and so on until the model converges, yielding the target prediction model. The target prediction model may subsequently be applied to a recommendation scenario. For example, it may be used to predict a target click rate and a target conversion rate from the target user attribute information of a predicted user and the target object attribute information of an object to be recommended (such as an online anchor or an information article). The electronic device may then carry out a recommendation task for the predicted user based on the target click rate and the target conversion rate: for example, it determines which object to be recommended to push based on the predicted values and pushes that object to the user terminal of the predicted user, who may click to view related information about the pushed object (such as the live broadcast of the online anchor or the details of the information article).

In the embodiment of the application, a sample feature set can be obtained, the sample feature set is input into an initial prediction model, the predicted click rate and the predicted conversion rate of a sample user for a sample recommendation object are obtained based on the initial prediction model, a click loss function for the initial prediction model is generated based on a click label and the predicted click rate, a click conversion loss function for the initial prediction model is generated based on the click label, a conversion label, the predicted click rate and the predicted conversion rate, a weight loss function is obtained according to the click loss function, the click conversion loss function and model parameters of the initial prediction model, a click weight corresponding to the click loss function and a click conversion weight corresponding to the click conversion loss function are generated based on the weight loss function, a target loss function is obtained based on the click weight, the click conversion weight, the click loss function and the click conversion loss function, and modifying the model parameters of the initial prediction model based on the target loss function to obtain a target prediction model. By implementing the method provided by the embodiment of the application, the click weight and the click conversion weight can be dynamically adjusted in the model training process so as to keep the balance of two loss functions in the target loss function, so that the training effect of the target prediction model is better, and the accuracy of the target prediction model for predicting the click rate and the conversion rate is higher.

Referring to fig. 3, fig. 3 is a flowchart illustrating a recommendation model training method according to an embodiment of the present application, which can be executed by the above-mentioned electronic device. As shown in fig. 3, the flow of the recommendation model training method in the embodiment of the present application may include the following steps:

S301, obtaining a sample feature set. For a specific implementation of step S301, reference may be made to the related description of step S201, which is not described herein again.

S302, inputting the sample feature set into an initial prediction model, and obtaining the predicted click rate and the predicted conversion rate of the sample user for the sample recommendation object based on the initial prediction model.

In one possible implementation, the electronic device may use a feature processing layer in the sharing layer of the initial prediction model to generate a user attribute feature corresponding to the user attribute information and an object attribute feature corresponding to the object attribute information. The feature processing layer may include an attribute feature generation layer.

Optionally, to generate the user attribute feature and the object attribute feature by using the feature processing layer, the electronic device may, in the attribute feature generation layer, generate the user attribute feature from the user attribute information and generate the object attribute feature from the object attribute information. Taking the user attribute feature as an example, the electronic device may perform one-hot encoding on the user attribute information to obtain a user attribute code, construct an embedding matrix for the user attribute, and determine the user attribute feature from the user attribute code and the corresponding embedding matrix.

In some embodiments, for the specific manner of performing one-hot encoding on the user attribute information to obtain the user attribute code, refer to the related description of step S202. To determine the user attribute feature from the user attribute code and the corresponding embedding matrix, the electronic device may obtain the column index of the element equal to 1 in the user attribute code, fetch from the embedding matrix the row vector whose row index equals that column index, and use the row vector as the user attribute feature. For example, suppose the user attribute information includes an age attribute divided into the ranges [<18, 19-30, 31-40, 41-50, 51-60, >60]. If the age of the sample user is 24, the one-hot code of the age attribute is [0,1,0,0,0,0], so the element equal to 1 is in column 2. With an embedding matrix of size 6 × 8, the row vector in row 2 is fetched, and its 8 elements serve as the attribute feature corresponding to the age attribute. In this way, one or more row vectors can be obtained from the one or more attributes contained in the user attribute information, and these row vectors form a feature matrix, which is the user attribute feature.
The embedding matrices corresponding to different attributes may be the same or different; neither the size of each embedding matrix nor the specific value of each of its elements is limited.
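The age example can be sketched as follows. This is a minimal illustration: the bucket edges, the function names, and the contents of the 6 × 8 embedding matrix are hypothetical, and age 18 (which the listed ranges leave unassigned) is assumed to fall in the first bucket. The code uses 0-based indices, so "column 2" and "row 2" in the text correspond to index 1.

```python
import bisect

# Upper edges of the first five age ranges [<18, 19-30, 31-40, 41-50, 51-60, >60].
# Assumption: age 18 is placed in the first bucket.
AGE_UPPER_EDGES = [18, 30, 40, 50, 60]
NUM_BUCKETS = len(AGE_UPPER_EDGES) + 1

def one_hot_age(age):
    """Return the one-hot code of the age bucket that contains `age`."""
    code = [0] * NUM_BUCKETS
    code[bisect.bisect_left(AGE_UPPER_EDGES, age)] = 1
    return code

def lookup(code, embedding_matrix):
    """Fetch the row whose index equals the position of the 1 in the code."""
    row = code.index(1)
    return embedding_matrix[row]

# Hypothetical 6 x 8 embedding matrix; in practice it may be learned.
embedding_matrix = [[r + c / 10 for c in range(8)] for r in range(6)]

code = one_hot_age(24)
age_feature = lookup(code, embedding_matrix)
print(code)
print(age_feature)
```

Looking up the row directly gives the same result as multiplying the one-hot code by the embedding matrix, without performing the multiplication.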

Optionally, the specific manner of generating the object attribute feature may be the same as the specific manner of generating the user attribute feature, and details are not described here. The embedding matrix for the user attribute and the embedding matrix for the object attribute can be set by related service personnel according to experience values, and can also be obtained by model training as model parameters of an initial prediction model.

For example, fig. 4 is a schematic diagram of a scenario for obtaining attribute features according to an embodiment of the present application. The user attribute information includes a user attribute 1 whose one-hot code is [0,1,0,0,0], so the row vector in row 2 of the corresponding embedding matrix 1 is taken as the attribute feature V_1 of user attribute 1. Attribute features (V_1, V_2, ..., V_n) are obtained in this way for the user attributes (1, 2, ..., n) included in the user attribute information, and the user attribute feature is feature matrix 1 composed of (V_1, V_2, ..., V_n). Likewise, attribute features (V_{n+1}, V_{n+2}, ..., V_N) are obtained for the object attributes (n+1, n+2, ..., N) included in the object attribute information, and the object attribute feature is feature matrix 2 composed of (V_{n+1}, V_{n+2}, ..., V_N). Feature matrix 1 and feature matrix 2 together form a target feature matrix. Optionally, the target feature matrix may take either of two forms, shown as target feature matrix 1 and target feature matrix 2 in fig. 4: target feature matrix 1 arranges each attribute feature as a column, and target feature matrix 2 arranges each attribute feature as a row.

In one possible implementation, to obtain the predicted click rate and the predicted conversion rate of the sample user for the sample recommendation object based on the initial prediction model, the electronic device may obtain the user attribute feature and the object attribute feature and obtain the predicted click rate and the predicted conversion rate based on them. Specifically, the electronic device performs feature fusion on the user attribute feature and the object attribute feature to obtain a fusion attribute feature, and obtains the predicted click rate and the predicted conversion rate based on the fusion attribute feature.

In some embodiments, the electronic device performs feature fusion on the user attribute feature and the object attribute feature to obtain a fusion attribute feature, specifically, the user attribute feature and the object attribute feature are spliced to obtain the fusion attribute feature, and the fusion attribute feature may be a feature vector obtained by splicing a feature matrix based on the user attribute feature and a feature matrix based on the object attribute feature; or, the feature processing layer may further include a feature fusion layer, and feature fusion is performed on the user attribute feature and the object attribute feature by using the feature fusion layer in the feature processing layer to obtain a fusion attribute feature.

Optionally, to perform feature fusion by using the feature fusion layer in the feature processing layer, the electronic device may, in the feature fusion layer, take the user attribute features and the object attribute features as a feature set, perform feature crossing on every two features in the feature set (that is, take the inner product of the two vectors) to obtain cross features, splice the user attribute features and the object attribute features to obtain a splicing feature, and obtain the fusion attribute feature from the cross features and the splicing feature. For example, the elements of the cross features and of the splicing feature may be tiled in sequence into a target vector, which is used as the fusion attribute feature. The pairwise feature crossing borrows the idea of the factorization machine (FM), namely:

<V_i, V_j>, 1 ≤ i ≤ N, 1 ≤ j ≤ N

where <·,·> denotes the vector inner product, V_i and V_j denote any two features in the feature set, and N denotes the number of features in the feature set.
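The crossing and splicing steps can be sketched as follows. As an assumption, only distinct unordered pairs (i < j) are crossed, as in a typical factorization machine, although the index range above would also admit i = j and both orderings; the function names and the toy features are hypothetical.

```python
from itertools import combinations

def inner(u, v):
    """Vector inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))

def cross_features(features):
    """Inner product of every unordered pair of attribute features."""
    return [inner(u, v) for u, v in combinations(features, 2)]

def fuse(features):
    """Tile the cross features, then the spliced features, into one vector."""
    spliced = [x for f in features for x in f]
    return cross_features(features) + spliced

# Three made-up attribute features of dimension 2.
features = [[1.0, 0.0], [2.0, 3.0], [0.5, 0.5]]
fused = fuse(features)
print(fused)
```

For N attribute features of dimension d, the fused vector in this sketch has N(N-1)/2 cross elements followed by N·d spliced elements.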

It can be understood that the fusion attribute feature carries correlation information among features within the user attribute feature, within the object attribute feature, and between the user attribute feature and the object attribute feature. Because these relationships are obtained through explicit interaction between features, the fusion attribute feature captures the relationship between every pair of features, which mitigates the negative effect that sparse features would otherwise have on subsequent model training.

Further optionally, after the user attribute feature and the object attribute feature are spliced to obtain the splicing feature, a first target attribute feature of the user attribute feature and/or a second target attribute feature of the object attribute feature may be obtained, and, within the splicing feature, the first target attribute feature and/or the second target attribute feature may be multiplied by the attribute information corresponding to it to obtain a processed splicing feature; the fusion attribute feature is then obtained from the cross features and the processed splicing feature. The first target attribute feature and the second target attribute feature may be set by relevant service personnel according to actual conditions and empirical values.

For example, fig. 5 is a schematic diagram of a scenario for obtaining the fusion attribute feature according to an embodiment of the present application. For the feature set consisting of the user attribute features (V_1, V_2, ..., V_n) and the object attribute features (V_{n+1}, V_{n+2}, ..., V_N), feature crossing is performed on every two features to obtain the cross features, the user attribute features and the object attribute features are spliced to obtain the splicing feature, and a fusion attribute feature (such as fusion attribute feature 1) is obtained from the cross features and the splicing feature. Further, after the splicing feature is obtained, suppose the first target attribute feature is V_1 (set as [1,2,3,4,5,6]) and corresponds to the average click rate attribute of the sample user (whose attribute information is 0.3). Within the splicing feature, the first target attribute feature is multiplied by its corresponding attribute information to obtain the processed splicing feature (that is, V_1: [1,2,3,4,5,6] × 0.3 → V′_1: [0.3,0.6,0.9,1.2,1.5,1.8]), and a fusion attribute feature (such as fusion attribute feature 2) is obtained from the cross features and the processed splicing feature.
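The product processing in this example can be sketched as follows. The helper name and the second feature are hypothetical; only V_1 = [1,2,3,4,5,6] and the attribute value 0.3 come from the example above.

```python
def scale_target_feature(features, target_index, attribute_value):
    """Multiply one attribute feature by its raw attribute value; copy the rest."""
    return [
        [x * attribute_value for x in f] if i == target_index else list(f)
        for i, f in enumerate(features)
    ]

# V_1 from the example plus a made-up second feature.
features = [[1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1]]

processed = scale_target_feature(features, target_index=0, attribute_value=0.3)
processed_splicing = [x for f in processed for x in f]  # processed splicing feature
print([round(x, 2) for x in processed[0]])
```

Values are rounded only for display, since floating-point products such as 3 × 0.3 are not exactly representable.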

In a possible implementation, to obtain the predicted click rate and the predicted conversion rate based on the fusion attribute feature, the electronic device may generate a click prediction feature and a conversion prediction feature from the fusion attribute feature, obtain the predicted click rate from the click prediction feature, and obtain the predicted conversion rate from the conversion prediction feature. Specifically, the electronic device may use the prediction feature generation layer to generate the click prediction feature and the conversion prediction feature from the fusion attribute feature.

In some embodiments, the prediction feature generation layer may be the shared layer structure of a multi-task learning model (e.g., the MMoE (Multi-gate Mixture-of-Experts) model), and may include a click weight prediction network, a conversion weight prediction network, and N prediction feature generation networks. To generate the click prediction feature and the conversion prediction feature, the electronic device may: generate, based on the fusion attribute feature and the N prediction feature generation networks, the initial prediction feature corresponding to each prediction feature generation network; predict a first prediction weight for each prediction feature generation network based on the fusion attribute feature and the click weight prediction network; predict a second prediction weight for each prediction feature generation network based on the fusion attribute feature and the conversion weight prediction network; compute the weighted sum of the initial prediction features using the first prediction weights to obtain the click prediction feature; and compute the weighted sum of the initial prediction features using the second prediction weights to obtain the conversion prediction feature.

To generate the initial prediction feature corresponding to each prediction feature generation network, the electronic device may input the fusion attribute feature into the N prediction feature generation networks, each of which generates its initial prediction feature from the fusion attribute feature. To predict the first and second prediction weights, the electronic device may input the fusion attribute feature into the click weight prediction network, which generates the first prediction weight corresponding to each prediction feature generation network, and input the fusion attribute feature into the conversion weight prediction network, which generates the second prediction weight corresponding to each prediction feature generation network.

According to the above description, this can be expressed as follows.

The first prediction weight g_ctr(x) is: g_ctr(x) = softmax(W_ctr x)

The second prediction weight g_cvr(x) is: g_cvr(x) = softmax(W_cvr x)

The click prediction feature f_ctr(x) is: f_ctr(x) = Σ_{i=1}^{N} g_ctr(x)_i · f_i(x)

The conversion prediction feature f_cvr(x) is: f_cvr(x) = Σ_{i=1}^{N} g_cvr(x)_i · f_i(x)

where W_ctr denotes the network parameters of the click weight prediction network; W_cvr denotes the network parameters of the conversion weight prediction network; x denotes the fusion attribute feature; g_ctr(x)_i denotes the first prediction weight corresponding to the i-th of the N prediction feature generation networks; g_cvr(x)_i denotes the second prediction weight corresponding to the i-th of the N prediction feature generation networks; and f_i(x) denotes the initial prediction feature corresponding to the i-th of the N prediction feature generation networks.
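The gated weighted summation can be sketched as follows. As assumptions made only for illustration, each prediction feature generation network (expert) is a single fully connected layer with ReLU and no bias, each weight prediction network (gate) is a single linear layer followed by softmax, and all matrices are made-up values; the function names are hypothetical.

```python
import math

def softmax(z):
    m = max(z)                              # subtract the max for stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def expert(W, x):
    """f_i(x): one fully connected layer with ReLU (assumed expert shape)."""
    return [max(0.0, v) for v in matvec(W, x)]

def gated_feature(x, experts, W_gate):
    """Sum over i of g(x)_i * f_i(x), where g(x) = softmax(W_gate x)."""
    outputs = [expert(W, x) for W in experts]
    gate = softmax(matvec(W_gate, x))       # one weight per expert
    dim = len(outputs[0])
    return [sum(g * out[d] for g, out in zip(gate, outputs)) for d in range(dim)]

x = [1.0, 2.0]                              # made-up fusion attribute feature
experts = [
    [[1.0, 0.0], [0.0, 1.0]],               # expert 1 maps x to [1, 2]
    [[0.0, 1.0], [1.0, 0.0]],               # expert 2 maps x to [2, 1]
]
W_gate_ctr = [[0.0, 0.0], [0.0, 0.0]]       # uniform gate: both experts weighted 0.5
W_gate_cvr = [[10.0, 0.0], [0.0, 0.0]]      # gate that strongly favors expert 1

f_ctr = gated_feature(x, experts, W_gate_ctr)
f_cvr = gated_feature(x, experts, W_gate_cvr)
print(f_ctr, f_cvr)
```

Both tasks share the same experts; only the gates differ, so each task can weight the shared representations differently.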

Optionally, the click weight prediction network, the conversion weight prediction network, and the N prediction feature generation networks may each consist of one or more fully connected (FC) layers. The click weight prediction network and the conversion weight prediction network may also be referred to as gate networks, and the N prediction feature generation networks may also be referred to as expert networks. The model parameters of the initial prediction model include the network parameters of the click weight prediction network, the conversion weight prediction network, and the N prediction feature generation networks. The value of N may be set by relevant service personnel based on empirical values.

In one possible implementation, to obtain the predicted click rate based on the click prediction feature and the predicted conversion rate based on the conversion prediction feature, the electronic device may use the prediction layer of the initial prediction model. The prediction layer may follow the ESMM (Entire Space Multi-Task Model) and may include a click prediction network (CTR top) constructed for the click rate prediction task and a conversion prediction network (CVR top) constructed for the conversion rate prediction task, so that the predicted click rate is obtained by the click prediction network from the click prediction feature and the predicted conversion rate is obtained by the conversion prediction network from the conversion prediction feature. The click prediction network and the conversion prediction network may each include one or more fully connected layers.
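A minimal sketch of such a prediction layer follows. It assumes each prediction network is a single fully connected layer with a sigmoid output, and that, as in the published ESMM formulation, the click-and-convert probability used by the click conversion loss is the product of the two predicted rates; the weights and input features are invented for illustration.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def tower(feature, weights, bias):
    """One fully connected layer with sigmoid output (assumed tower shape)."""
    return sigmoid(sum(w * f for w, f in zip(weights, feature)) + bias)

f_ctr = [1.5, 1.5]   # made-up click prediction feature
f_cvr = [1.0, 2.0]   # made-up conversion prediction feature

p_ctr = tower(f_ctr, weights=[0.4, -0.2], bias=0.1)    # predicted click rate
p_cvr = tower(f_cvr, weights=[-0.3, 0.1], bias=-0.5)   # predicted conversion rate
p_ctcvr = p_ctr * p_cvr                                # click-and-convert probability
print(p_ctr, p_cvr, p_ctcvr)
```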

S303, generating a click loss function based on the click label and the predicted click rate, and generating a click conversion loss function based on the click label, the conversion label, the predicted click rate and the predicted conversion rate.

S304, a first weight function corresponding to the click loss function and a second weight function corresponding to the click conversion loss function are obtained. The specific implementation of steps S303 to S304 may refer to the related description of steps S203 to S204, which is not described herein again.

S305, determining a click gradient function according to the first weight function, the click loss function and the model parameters of the initial prediction model, and determining a click conversion gradient function according to the second weight function, the click conversion loss function and the model parameters of the initial prediction model.

In one possible embodiment, the initial prediction model further includes a weight adjustment layer. The electronic device may use the weight adjustment layer to determine the click gradient function according to the first weight function, the click loss function, and the model parameters of the initial prediction model, and to determine the click conversion gradient function according to the second weight function, the click conversion loss function, and the model parameters of the initial prediction model; the weight loss function, the click weight, and the click conversion weight are likewise determined in the weight adjustment layer.

In some embodiments, the click gradient function may specifically be determined from the first weight function, the click loss function, and the network parameters of the sharing layer among the model parameters, and the click conversion gradient function may be determined from the second weight function, the click conversion loss function, and the network parameters of the sharing layer among the model parameters. Namely:

where $G_W^{ctr}$ represents the click gradient function; $G_W^{ctcvr}$ represents the click conversion gradient function; $\theta_{share}$ represents the network parameters of the shared layer; $t$ may represent the $t$-th training process for the initial prediction model; $L_{ctr}(t)$ represents the click loss function; $L_{ctcvr}(t)$ represents the click conversion loss function; $W_{ctr}(t)$ represents the first weight function; $W_{ctcvr}(t)$ represents the second weight function; and $\nabla_B A$ denotes computing the gradient of A with respect to B.
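As a sketch consistent with the symbol definitions above, the two gradient functions could be written as follows. The use of the L2 norm as the gradient magnitude follows the usual gradient-normalization formulation and is an assumption; the description only states which quantities each gradient function depends on.

```latex
G_W^{ctr}(t) = \left\lVert \nabla_{\theta_{share}} \bigl( W_{ctr}(t)\, L_{ctr}(t) \bigr) \right\rVert_2,
\qquad
G_W^{ctcvr}(t) = \left\lVert \nabla_{\theta_{share}} \bigl( W_{ctcvr}(t)\, L_{ctcvr}(t) \bigr) \right\rVert_2
```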

Optionally, the network parameter of the shared layer herein may refer to a network parameter in the N predicted feature generation networks, or may be another network parameter in the shared layer, which is not limited herein.

It can be understood that the weight adjusting layer implements Gradient Normalization to balance the gradients (or weights) of the two tasks of model training, so that the click weight of the click loss function and the click conversion weight of the click conversion loss function can be dynamically adjusted in the model training process by the weight adjusting layer to balance the coefficients of the two loss functions in the target loss function, thereby obtaining a better model training effect, where the click weight is a coefficient of the click loss function in the target loss function, and the click conversion weight is a coefficient of the click conversion loss function in the target loss function.

S306, generating a weight loss function according to the click gradient function and the click conversion gradient function.

In one possible embodiment, the electronic device may generate, at the weight adjustment layer, a weight loss function according to the click gradient function and the click conversion gradient function, and specifically, may determine an average gradient function according to the first weight function, the second weight function, the click loss function, the click conversion loss function, and the model parameter, determine a target click gradient function according to the click gradient function and the average gradient function, determine a target click conversion gradient function according to the click conversion gradient function and the average gradient function, and generate the weight loss function according to the target click gradient function and the target click conversion gradient function. The model parameters used in the average gradient function may be network parameters of the click prediction network and network parameters of the conversion prediction network.

That is, the average gradient function is determined by averaging a first gradient function and a second gradient function, where $\theta_{ctr}$ represents the network parameters of the click prediction network; $\theta_{cvr}$ represents the network parameters of the conversion prediction network; $G_W^{1}(t)$ represents the first gradient function, determined from the first weight function, the click loss function, and the network parameters of the click prediction network; $G_W^{2}(t)$ represents the second gradient function, determined from the second weight function, the click conversion loss function, and the network parameters of the conversion prediction network; $E[A+B]$ denotes averaging the inputs A and B to obtain their average result; $t$ may represent the $t$-th training process for the initial prediction model; $L_{ctr}(t)$ represents the click loss function; $L_{ctcvr}(t)$ represents the click conversion loss function; $W_{ctr}(t)$ represents the first weight function; $W_{ctcvr}(t)$ represents the second weight function; and $\nabla_B A$ denotes computing the gradient of A with respect to B.
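Under the same assumption that gradient magnitudes are taken as L2 norms, the average gradient function described above could be sketched as below. The symbol $\bar{G}_W(t)$ for the average gradient function is chosen here for illustration only.

```latex
G_W^{1}(t) = \left\lVert \nabla_{\theta_{ctr}} \bigl( W_{ctr}(t)\, L_{ctr}(t) \bigr) \right\rVert_2,
\qquad
G_W^{2}(t) = \left\lVert \nabla_{\theta_{cvr}} \bigl( W_{ctcvr}(t)\, L_{ctcvr}(t) \bigr) \right\rVert_2

\bar{G}_W(t) = E\bigl[ G_W^{1}(t) + G_W^{2}(t) \bigr] = \tfrac{1}{2}\bigl( G_W^{1}(t) + G_W^{2}(t) \bigr)
```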

In one possible embodiment, the electronic device determines the target click gradient function and the target click conversion gradient function by the same process and principle, so the determination of the target click gradient function is taken as an example. To determine the target click gradient function according to the click gradient function and the average gradient function, the electronic device may first determine a target inverse training rate of the model training according to the click loss function and the click conversion loss function, and then determine the target click gradient function according to the click gradient function, the average gradient function, and the target inverse training rate.

The electronic device may determine the target inverse training rate of the model training according to the click loss function and the click conversion loss function as follows: obtain an initial click loss function and an initial click conversion loss function; determine the inverse training rate for the click rate prediction task according to the initial click loss function and the click loss function, and determine the inverse training rate for the conversion rate prediction task according to the initial click conversion loss function and the click conversion loss function; determine an average inverse training rate based on these two inverse training rates; determine a relative inverse training rate for the click rate prediction task according to the inverse training rate for the click rate prediction task and the average inverse training rate; and take that relative inverse training rate as the target inverse training rate. Optionally, the same average inverse training rate is used when determining the target click conversion gradient function as when determining the target click gradient function. Namely:

where the inverse training rate for the click rate prediction task, the inverse training rate for the conversion rate prediction task, and the average inverse training rate are as described above; $L_{ctr}(0)$ represents the initial click loss function, i.e., the click loss function at the start of model training; $L_{ctcvr}(0)$ represents the initial click conversion loss function, i.e., the click conversion loss function at the start of model training; $r_{ctr}(t)$ represents the relative inverse training rate for the click rate prediction task (i.e., the target inverse training rate when determining the target click gradient function); $r_{cvr}(t)$ represents the relative inverse training rate for the conversion rate prediction task (i.e., the target inverse training rate when determining the target click conversion gradient function); $Grad_{ctr}(t)$ represents the target click gradient function; $Grad_{ctcvr}(t)$ represents the target click conversion gradient function; $t$ may represent the $t$-th training process for the initial prediction model; and $\alpha$ is a hyperparameter that may be set by the relevant business personnel based on empirical values.

In some embodiments, the weight loss function $Grad(t)$ generated from the target click gradient function $Grad_{ctr}(t)$ and the target click conversion gradient function $Grad_{ctcvr}(t)$ can be expressed as:

$Grad(t) = Grad_{ctr}(t) + Grad_{ctcvr}(t)$
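The chain from loss values to the weight loss function can be sketched in Python as follows. This is a minimal illustration, not the exact formulas of this application: the ratio forms of the inverse training rates and the absolute-difference form of each target gradient function are assumptions borrowed from the usual gradient-normalization formulation, and the function name `weight_loss` and its parameters are hypothetical. Only the final sum Grad(t) = Grad_ctr(t) + Grad_ctcvr(t) is taken directly from the text.

```python
def weight_loss(g_ctr, g_ctcvr, g_avg,
                l_ctr, l_ctcvr, l_ctr0, l_ctcvr0, alpha=1.0):
    """Return (Grad_ctr, Grad_ctcvr, Grad) for one training step t.

    g_ctr, g_ctcvr : click / click conversion gradient magnitudes
    g_avg          : average gradient magnitude
    l_*            : current losses; l_*0 are the initial losses at t = 0
    alpha          : hyperparameter set from empirical values
    """
    # Inverse training rate of each task (assumed: current loss / initial loss).
    inv_ctr = l_ctr / l_ctr0
    inv_cvr = l_ctcvr / l_ctcvr0

    # Average inverse training rate, shared by both tasks.
    inv_avg = (inv_ctr + inv_cvr) / 2.0

    # Relative inverse training rates (assumed: task rate / average rate).
    r_ctr = inv_ctr / inv_avg
    r_cvr = inv_cvr / inv_avg

    # Target gradient functions (assumed: distance between each task's
    # gradient magnitude and the average magnitude scaled by r ** alpha).
    grad_ctr = abs(g_ctr - g_avg * r_ctr ** alpha)
    grad_ctcvr = abs(g_ctcvr - g_avg * r_cvr ** alpha)

    # Weight loss function: the sum stated in the description.
    return grad_ctr, grad_ctcvr, grad_ctr + grad_ctcvr
```

With this form, a task whose loss has fallen more slowly than average gets a relative inverse training rate above 1 and therefore a larger target gradient magnitude.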

S307, generating a click weight corresponding to the click loss function and a click conversion weight corresponding to the click conversion loss function based on the weight loss function.

In one possible embodiment, the electronic device may adjust the first weight function based on the weight loss function to obtain an adjusted first weight function, and obtain the click weight (i.e., the adjusted click weight) by using the adjusted first weight function, and adjust the second weight function based on the weight loss function to obtain an adjusted second weight function, and obtain the click conversion weight (i.e., the adjusted click conversion weight) by using the adjusted second weight function.

In some embodiments, to adjust the first weight function based on the weight loss function, the electronic device may take the derivative with respect to the first weight function based on the weight loss function to obtain a first weight adjustment function, and obtain the adjusted first weight function according to the first weight adjustment function and the first weight function. The electronic device may then determine the click weight for this round of model training according to the adjusted first weight function, that is, according to the first weight adjustment function and the first weight function. Likewise, to adjust the second weight function based on the weight loss function, the electronic device may take the derivative with respect to the second weight function based on the weight loss function to obtain a second weight adjustment function, and obtain the adjusted second weight function according to the second weight adjustment function and the second weight function. The electronic device may then determine the click conversion weight for this round of model training according to the adjusted second weight function, that is, according to the second weight adjustment function and the second weight function. The adjusted first weight function $W'_{ctr}(t)$ may be:

$W'_{ctr}(t) = W_{ctr}(t) + \lambda_{ctr}\,\beta_{ctr}(t)$

where $W_{ctr}(t)$ represents the first weight function, $\beta_{ctr}(t)$ represents the first weight adjustment function, $t$ may represent the $t$-th training process for the initial prediction model, and $\lambda_{ctr}$ represents a hyperparameter that may be set by the relevant business personnel based on empirical values.

The adjusted second weight function $W'_{ctcvr}(t)$ may be:

$W'_{ctcvr}(t) = W_{ctcvr}(t) + \lambda_{ctcvr}\,\beta_{ctcvr}(t)$

where $W_{ctcvr}(t)$ represents the second weight function, $\beta_{ctcvr}(t)$ represents the second weight adjustment function, $t$ may represent the $t$-th training process for the initial prediction model, and $\lambda_{ctcvr}$ represents a hyperparameter that may be set by the relevant business personnel based on empirical values. $\lambda_{ctr}$ and $\lambda_{ctcvr}$ may be the same or different.
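The two update formulas can be illustrated with a small Python sketch. The function name `adjust_weights` and the default values of the hyperparameters are hypothetical. The sketch adds the scaled adjustment exactly as the formulas are written; whether the weight adjustment function already carries the sign of a descent direction is not specified here and is left to the caller.

```python
def adjust_weights(w_ctr, w_ctcvr, beta_ctr, beta_ctcvr,
                   lam_ctr=0.025, lam_ctcvr=0.025):
    """Apply W'(t) = W(t) + lambda * beta(t) to both weight functions.

    beta_ctr / beta_ctcvr are the first / second weight adjustment functions,
    i.e. the derivative-based terms obtained from the weight loss function.
    """
    adjusted_w_ctr = w_ctr + lam_ctr * beta_ctr
    adjusted_w_ctcvr = w_ctcvr + lam_ctcvr * beta_ctcvr
    return adjusted_w_ctr, adjusted_w_ctcvr
```

The returned pair would serve as the click weight and click conversion weight of the current round and as the starting weight functions of the next round.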

It can be understood that the adjusted first weight function and the adjusted second weight function from this round of model training serve as the first weight function and the second weight function in the next round. In the next round they are adjusted again to obtain that round's adjusted first weight function and adjusted second weight function, and thus that round's click weight and click conversion weight. The target loss function of the next round is determined from these weights and is used to correct the model parameters in that round. This repeats over multiple rounds of model training until the model converges, yielding the final target prediction model.

S308, obtaining a target loss function based on the click weight, the click conversion weight, the click loss function and the click conversion loss function, and correcting model parameters of the initial prediction model based on the target loss function to obtain a target prediction model.

In a possible embodiment, to obtain the target loss function based on the click weight, the click conversion weight, the click loss function, and the click conversion loss function, the electronic device may weight the click loss function by the click weight to obtain a first weighted loss function, weight the click conversion loss function by the click conversion weight to obtain a second weighted loss function, and generate the target loss function from the first weighted loss function and the second weighted loss function. The electronic device may then correct the model parameters of the initial prediction model based on the target loss function to obtain the target prediction model. Only one round of model training is described here as an example; the process and principle are the same in every round, and the target prediction model is obtained after the model converges. It can be understood that in each round the model parameters are corrected based on the target loss function, and the target loss function itself may differ from round to round: when the target loss function is obtained in a given round, the click weight and the click conversion weight used in the previous round's target loss function (the initial click weight and the initial click conversion weight) are dynamically adjusted using the determined weight loss function to give the click weight and the click conversion weight used in the current round's target loss function.
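A minimal Python sketch of the weighting step follows. Combining the two weighted loss functions by summation is an assumption (the description only says the target loss function is generated from them), and the function name `target_loss` is hypothetical.

```python
def target_loss(click_weight, click_conversion_weight,
                click_loss, click_conversion_loss):
    """Combine the two task losses using the current round's weights."""
    # First weighted loss function: click loss scaled by the click weight.
    first_weighted = click_weight * click_loss
    # Second weighted loss function: click conversion loss scaled by its weight.
    second_weighted = click_conversion_weight * click_conversion_loss
    # Assumed combination: simple sum of the two weighted losses.
    return first_weighted + second_weighted
```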

For example, refer to FIG. 6, which is a schematic view of a model training scenario provided in an embodiment of the present application. The initial prediction model includes a shared layer, a prediction layer, and a weight adjustment layer; the shared layer includes a feature processing layer and a prediction feature generation layer; and the feature processing layer includes an attribute feature generation layer and a feature fusion layer. (1) The electronic device inputs the sample feature set into the initial prediction model and generates user attribute features and object attribute features in the attribute feature generation layer. (2) Fusion attribute features are generated in the feature fusion layer according to the user attribute features and the object attribute features. (3) A click prediction feature and a conversion prediction feature are generated in the prediction feature generation layer according to the fusion attribute features. (4) In the prediction layer, a predicted click rate is generated by the click prediction network according to the click prediction feature, a predicted conversion rate is generated by the conversion prediction network according to the conversion prediction feature, a click loss function is obtained according to the predicted click rate, and a click conversion loss function is obtained according to the predicted click rate and the predicted conversion rate. (5) In the weight adjustment layer, the click weight and the click conversion weight are obtained according to the click loss function, the click conversion loss function, and the model parameters of the initial prediction model, so that a target loss function is obtained; the initial prediction model is corrected using the target loss function to obtain a target prediction model for predicting the click rate and the conversion rate.
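Step (4) can be illustrated for a single sample with the Python sketch below. It assumes an ESMM-style construction in which the click conversion probability is the product of the predicted click rate and the predicted conversion rate, and it assumes binary log loss for both tasks; neither form is spelled out in this passage, and the function names are hypothetical.

```python
import math


def log_loss(label, prob, eps=1e-12):
    """Binary log loss for one sample, with clipping to avoid log(0)."""
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def click_and_click_conversion_loss(click_label, conversion_label, p_ctr, p_cvr):
    """Return (click loss, click conversion loss) for one sample."""
    # Click loss: click label against the predicted click rate.
    click_loss = log_loss(click_label, p_ctr)

    # Assumed ESMM-style product: probability of click followed by conversion.
    p_ctcvr = p_ctr * p_cvr
    # A sample counts as click-and-convert only if both labels are 1.
    ctcvr_label = click_label * conversion_label
    click_conversion_loss = log_loss(ctcvr_label, p_ctcvr)

    return click_loss, click_conversion_loss
```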

Extensive testing of the target prediction model shows that its prediction accuracy and efficiency for the click rate and the conversion rate are greatly improved compared with the prior art. Taking the recommendation scenario of a live broadcast product as an example, test results for recommendation tasks performed with the target prediction model show that both the click rate and the conversion rate (i.e., effective viewing) are improved. The data sets used are shown in Table 1 below:

Data set	Exposures	Clicks	Conversions (effective views)
Training set	42.48 million	14.51 million	1.48 million
Test set	4.35 million	1.56 million	0.16 million

TABLE 1

The initial prediction model is trained using the training set as the sample feature set to obtain the target prediction model, and the target prediction model is tested using the test set. The test set contains about 4.35 million exposures (pushes of online anchors), about 1.56 million clicks, and about 160,000 conversions.

In addition, a comparison of various prediction models shows that the model evaluation index AUC (area under the curve) of the target prediction model provided by the technical solution of the present application is higher than that of existing prediction models; see Table 2 below:

Model	CTR AUC	CVR AUC
ESMM	0.7096	0.7427
ESMM+GradNorm	0.7094	0.7429
FM+ESMM	0.7099	0.7432
FM+MMOE+ESMM	0.7248	0.7447
FM+ESMM+GradNorm	0.7102	0.7433
FM+MMOE+ESMM+GradNorm	0.7473	0.7487

TABLE 2

The larger the CTR AUC and the CVR AUC, the better the model predicts the click rate and the conversion rate, that is, the higher its accuracy. Table 2 shows that the target prediction model trained with the final model structure adopted in the technical solution of the present application (FM + MMOE + ESMM + GradNorm) reaches a CTR AUC of 0.7473 and a CVR AUC of 0.7487, both the highest among the compared models.

In the embodiment of the application, a sample feature set can be obtained and input into an initial prediction model, and a predicted click rate and a predicted conversion rate of a sample user for a sample recommendation object are obtained based on the initial prediction model. A click loss function is generated based on the click label and the predicted click rate, and a click conversion loss function is generated based on the click label, the conversion label, the predicted click rate, and the predicted conversion rate. A first weight function corresponding to the click loss function and a second weight function corresponding to the click conversion loss function are obtained. The click gradient function is determined according to the first weight function, the click loss function, and the model parameters of the initial prediction model, and the click conversion gradient function is determined according to the second weight function, the click conversion loss function, and the model parameters of the initial prediction model. The weight loss function is generated according to the click gradient function and the click conversion gradient function, and a click weight corresponding to the click loss function and a click conversion weight corresponding to the click conversion loss function are generated based on the weight loss function. A target loss function is obtained based on the click weight, the click conversion weight, the click loss function, and the click conversion loss function, and the model parameters of the initial prediction model are corrected based on the target loss function to obtain a target prediction model.
By implementing the method provided by the embodiment of the application, the click weight and the click conversion weight can be dynamically adjusted using gradients during model training so as to keep the two loss functions in the target loss function balanced. The target prediction model is thereby trained more effectively and predicts the click rate and the conversion rate more accurately.

Referring to fig. 7, fig. 7 is a flowchart illustrating a data recommendation method based on a recommendation model according to an embodiment of the present application, where the method may be executed by the above-mentioned electronic device. As shown in fig. 7, a flow of the data recommendation method based on the recommendation model in the embodiment of the present application may include the following steps:

S701, acquiring target user attribute information of a predicted user and target object attribute information of an object to be recommended.

In a possible implementation manner, the trained target prediction model can predict the click rate and the conversion rate of a user for an object to be recommended, so that objects to be recommended can be pushed accurately to the predicted user. Taking the recommendation scenario of a live broadcast product as an example, the predicted user is a user logged in to the application client of the live broadcast product, and the objects to be recommended are online anchors. When the predicted user clicks the live broadcast interface, the electronic device acquires the one or more anchors currently online, and acquires the target user attribute information of the predicted user and the target object attribute information of each object to be recommended. The electronic device may be a background device of the application client.

S702, inputting the target user attribute information and the target object attribute information into a target prediction model.

In one possible implementation, the electronic device inputs the target user attribute information of the predicted user and the target object attribute information of the object to be recommended into the target prediction model. When a plurality of objects to be recommended exist, the target user attribute information of the predicted user and the target object attribute information of each object to be recommended are input into the target prediction model together.

The target prediction model may be obtained by training through the related description in the embodiment shown in fig. 2 and/or the embodiment shown in fig. 3.

S703, generating, in the target prediction model, a target predicted click rate and a target predicted conversion rate of the predicted user for the object to be recommended.

In some embodiments, the electronic device may generate a target predicted click rate and a target predicted conversion rate of the predicted user for each object to be recommended in the target prediction model respectively. It is understood that the target prediction model includes a shared layer and a prediction layer, the shared layer includes a feature processing layer and a predicted feature generation layer, and the feature processing layer includes an attribute feature generation layer and a feature fusion layer.

S704, obtaining the interest score of the prediction user for the object to be recommended based on the target prediction click rate and the target prediction conversion rate.

In some embodiments, the interest score ($P$) of the predicted user for the object to be recommended may be obtained from the target predicted click rate ($P_{ctr}$) and the target predicted conversion rate ($P_{cvr}$) by the following formula:

$P = P_{ctr} \cdot P_{cvr}^{\,a}$

where $a$ is a hyperparameter that may be set by the relevant business personnel based on empirical values.
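The interest score formula can be sketched in Python as follows; the function name `interest_score` is hypothetical. Note that with a = 0 the score reduces to the predicted click rate alone, and with a = 1 it equals the product of the predicted click rate and the predicted conversion rate.

```python
def interest_score(p_ctr, p_cvr, a):
    """Interest score P = P_ctr * P_cvr ** a for one (user, object) pair."""
    return p_ctr * p_cvr ** a
```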

For example, a may be set to each value in [0.0, 0.25, 0.5, 0.75, 1]. Tests are run on groups of traffic, each group being set to 1% of the full traffic, using various prediction models and the target prediction model of the present application. In the recommendation scenario of a live broadcast product, the target prediction model of the present application gives the best test results; see Table 3 below:

TABLE 3

Each column gives the traffic ratio obtained in the model tests when a takes the specified value (for example, a = 0.5). The traffic ratio is the ratio of the traffic corresponding to the effective viewing duration of the test users to the traffic consumed in pushing to those test users within a specified time. The larger the traffic ratio, the better the model predicts the click rate and the conversion rate, that is, the higher its accuracy. Table 3 shows that the target prediction model trained with the final model structure used in the present application (FM + MMOE + ESMM + GradNorm) has the highest traffic ratio among the compared models for each value of a.

S705, if the interest score is larger than the interest score threshold, pushing the object to be recommended to a user terminal corresponding to the predicted user.

In some embodiments, if the interest score of the predicted user for the object to be recommended is greater than the interest score threshold, the object to be recommended may be pushed to the user terminal of the predicted user; that is, the object to be recommended (an online anchor) is pushed to the predicted user in the live broadcast interface of the application client. The interest score threshold may be set by the relevant business personnel based on empirical values.

Optionally, when the interest scores corresponding to a plurality of objects to be recommended are greater than the interest score threshold, the plurality of objects to be recommended may be pushed randomly or sequentially according to the size of the interest scores; or when there are multiple objects to be recommended, pushing may be performed in sequence according to the sequence of the predicted interest scores of the user for the multiple objects to be recommended from large to small.
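The threshold check of S705 combined with ordered pushing can be sketched in Python as follows. The function name `select_for_push` and the dictionary input format are hypothetical; the sketch implements the variant that pushes in descending order of interest score.

```python
def select_for_push(scores, threshold):
    """Return ids of objects to push, highest interest score first.

    scores    : dict mapping object id -> interest score
    threshold : interest score threshold set from empirical values
    """
    # Keep only objects whose interest score exceeds the threshold.
    kept = [(obj, s) for obj, s in scores.items() if s > threshold]
    # Order the remaining objects from the largest score to the smallest.
    kept.sort(key=lambda item: item[1], reverse=True)
    return [obj for obj, _ in kept]
```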

For example, FIG. 8a and FIG. 8b are schematic diagrams of a push scenario for a predicted user according to an embodiment of the present application. As shown in FIG. 8a, when the predicted user clicks the live broadcast interface of the application client of a live broadcast product, the electronic device acquires the anchors currently online and pairs the predicted user with each online anchor to form pairs to be predicted ((u, i1), (u, i2), (u, i3), ..., (u, iN)). Based on these pairs, it extracts the target user attribute information of the predicted user and the target object attribute information of each online anchor from an attribute storage platform (which may be a database), and combines the target user attribute information of the predicted user with the target object attribute information of each online anchor to form attribute pairs ((u, i1), (u, i2), ...). Each attribute pair is input in turn into the target prediction model to obtain the target predicted click rate and the target predicted conversion rate of the predicted user for the online anchor in that pair. The electronic device can then obtain the interest score of the predicted user for each online anchor from the target predicted click rate and the target predicted conversion rate, sort the online anchors by interest score, and push the sorted online anchors in sequence to the user terminal of the predicted user, that is, to the live broadcast interface of the application client, as shown in FIG. 8b.
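The pairing and sorting flow of FIG. 8a can be sketched in Python as follows. The `predict` callable stands in for the target prediction model and is hypothetical, as are the function name `rank_anchors` and the dictionary layout of the attribute information.

```python
def rank_anchors(user_attrs, anchors, predict, a=0.5):
    """Rank online anchors for one predicted user by interest score.

    user_attrs : target user attribute information of the predicted user
    anchors    : dict mapping anchor id -> target object attribute information
    predict    : stand-in for the target prediction model; takes
                 (user_attrs, anchor_attrs) and returns (p_ctr, p_cvr)
    a          : hyperparameter of the interest score
    """
    scored = []
    for anchor_id, anchor_attrs in anchors.items():
        # One attribute pair (u, i_k) is fed to the model at a time.
        p_ctr, p_cvr = predict(user_attrs, anchor_attrs)
        scored.append((anchor_id, p_ctr * p_cvr ** a))
    # Sort from the highest interest score to the lowest before pushing.
    scored.sort(key=lambda item: item[1], reverse=True)
    return [anchor_id for anchor_id, _ in scored]
```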

According to the method and the device, target user attribute information of a prediction user and target object attribute information of an object to be recommended can be obtained, the target user attribute information and the target object attribute information are input into a target prediction model, a target prediction click rate and a target prediction conversion rate of the prediction user for the object to be recommended are generated in the target prediction model, an interest score of the prediction user for the object to be recommended is obtained based on the target prediction click rate and the target prediction conversion rate, and if the interest score is larger than an interest score threshold value, the object to be recommended is pushed to a user terminal corresponding to the prediction user. By implementing the method provided by the embodiment of the application, the interest score of the prediction user for the object to be recommended can be obtained by using the target prediction model, and the object to be recommended is pushed based on the interest score, so that accurate pushing in a recommendation scene is realized.

Referring to FIG. 9, FIG. 9 is a schematic structural diagram of a recommendation model training apparatus provided in the present application. It should be noted that the recommendation model training apparatus shown in FIG. 9 is used to execute the method of the embodiments shown in FIG. 2 and FIG. 3 of the present application. For convenience of description, only the portion related to the embodiment of the present application is shown; for technical details not disclosed here, refer to the embodiments shown in FIG. 2 and FIG. 3 of the present application. The recommendation model training apparatus 900 may include: an obtaining module 901, a generating module 902, and a correction module 903. Wherein:

an obtaining module 901, configured to obtain a sample feature set; the sample feature set comprises user attribute information of a sample user, object attribute information of a sample recommended object, a click label and a conversion label of the sample user for the sample recommended object;

the obtaining module 901 is further configured to input the sample feature set into an initial prediction model, and obtain a predicted click rate and a predicted conversion rate of the sample user for the sample recommendation object based on the initial prediction model;

a generating module 902, configured to generate a click loss function based on the click label and the predicted click rate, and generate a click conversion loss function based on the conversion label, the predicted click rate, and the predicted conversion rate;

the generating module 902 is further configured to obtain a weight loss function according to the click loss function, the click conversion loss function, and the model parameters of the initial prediction model, and generate a click weight corresponding to the click loss function and a click conversion weight corresponding to the click conversion loss function based on the weight loss function;

a correction module 903, configured to obtain a target loss function based on the click weight, the click conversion weight, the click loss function, and the click conversion loss function, and correct the model parameters of the initial prediction model based on the target loss function to obtain a target prediction model.

In a possible embodiment, the generating module 902, when configured to obtain the weight loss function according to the click loss function, the click conversion loss function, and the model parameters of the initial prediction model, is specifically configured to:

acquiring a first weight function corresponding to the click loss function and a second weight function corresponding to the click conversion loss function;

determining a click gradient function according to the first weight function, the click loss function and the model parameters of the initial prediction model;

determining a click conversion gradient function according to the second weight function, the click conversion loss function and the model parameters of the initial prediction model;

and generating a weight loss function according to the click gradient function and the click conversion gradient function.

In a possible implementation, the generating module 902, when configured to generate the weight loss function according to the click gradient function and the click conversion gradient function, is specifically configured to:

determining an average gradient function according to the first weight function, the second weight function, the click loss function, the click conversion loss function and the model parameter;

determining a target click gradient function according to the click gradient function and the average gradient function;

determining a target click conversion gradient function according to the click conversion gradient function and the average gradient function;

and generating a weight loss function according to the target click gradient function and the target click conversion gradient function.
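As a non-limiting illustration, the steps above can be sketched as follows. The sketch assumes that each gradient function is the L2 norm of the weighted loss gradient with respect to the model parameters, that the average gradient function is the mean of the two norms, and that each target gradient function is the absolute deviation from that mean. These choices, and the names `l2_norm` and `weight_loss`, are assumptions made for illustration; the embodiment does not fix the exact formulas.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a gradient vector given as a list of floats."""
    return math.sqrt(sum(v * v for v in vec))


def weight_loss(w_click, w_ctcvr, grad_click, grad_ctcvr):
    """Assumed form of the weight loss function.

    grad_click / grad_ctcvr are the gradients of the click loss and the
    click conversion loss with respect to the shared model parameters.
    """
    # Click gradient function and click conversion gradient function.
    g_click = w_click * l2_norm(grad_click)
    g_ctcvr = w_ctcvr * l2_norm(grad_ctcvr)
    # Average gradient function (assumed: arithmetic mean of the two).
    g_avg = 0.5 * (g_click + g_ctcvr)
    # Target gradient functions (assumed: absolute deviation from the mean).
    target_click = abs(g_click - g_avg)
    target_ctcvr = abs(g_ctcvr - g_avg)
    return target_click + target_ctcvr
```

Under these assumptions the weight loss is zero when the two weighted gradient norms are equal, which is the balanced state that the weight adjustment is intended to approach.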

In a possible embodiment, the generating module 902, when configured to generate the click weight corresponding to the click loss function and the click conversion weight corresponding to the click conversion loss function based on the weight loss function, is specifically configured to:

obtaining a first weight adjusting function by taking the derivative of the weight loss function with respect to the first weight function, and determining the click weight according to the first weight adjusting function and the first weight function;

and obtaining a second weight adjusting function by taking the derivative of the weight loss function with respect to the second weight function, and determining the click conversion weight according to the second weight adjusting function and the second weight function.
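As a non-limiting illustration, the weight determination can be sketched as one gradient-descent step on each weight. The sketch assumes a weight loss of the form |w1·n1 − avg| + |w2·n2 − avg|, where n1 and n2 are the gradient norms of the two losses and avg is treated as a constant when differentiating; the learning rate `lr` and the function names are hypothetical.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def update_weights(w_click, w_ctcvr, norm_click, norm_ctcvr, lr=0.01):
    """One assumed update step for the click and click conversion weights."""
    g_click = w_click * norm_click
    g_ctcvr = w_ctcvr * norm_ctcvr
    g_avg = 0.5 * (g_click + g_ctcvr)  # held constant during differentiation
    # Weight adjusting functions: derivative of the assumed weight loss
    # with respect to each weight.
    adj_click = sign(g_click - g_avg) * norm_click
    adj_ctcvr = sign(g_ctcvr - g_avg) * norm_ctcvr
    # New weights: previous weight minus a step along the derivative.
    return w_click - lr * adj_click, w_ctcvr - lr * adj_ctcvr
```

With gradient norms of 5.0 and 1.0 and equal starting weights, the click weight decreases and the click conversion weight increases, which moves the two weighted norms toward each other.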

In a possible embodiment, the correcting module 903, when configured to obtain the target loss function based on the click weight, the click conversion weight, the click loss function, and the click conversion loss function, is specifically configured to:

weighting the click loss function according to the click weight to obtain a first weighted loss function;

weighting the click conversion loss function according to the click conversion weight to obtain a second weighted loss function;

and generating a target loss function based on the first weighted loss function and the second weighted loss function.
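As a non-limiting illustration, the weighting steps above can be sketched as follows, assuming that the target loss function is the sum of the two weighted loss functions. The summation and the name `target_loss` are assumptions; the embodiment only states that the target loss function is generated from the two weighted loss functions.

```python
def target_loss(click_weight, ctcvr_weight, click_loss, ctcvr_loss):
    """Combine the two losses into a single training objective."""
    first_weighted = click_weight * click_loss    # first weighted loss function
    second_weighted = ctcvr_weight * ctcvr_loss   # second weighted loss function
    return first_weighted + second_weighted       # assumed combination: sum
```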

In a possible embodiment, the obtaining module 901, when configured to obtain the predicted click rate and the predicted conversion rate of the sample user for the sample recommendation object based on the initial prediction model, is specifically configured to:

generating user attribute characteristics corresponding to the user attribute information and generating object attribute characteristics corresponding to the object attribute information in the initial prediction model;

and obtaining a predicted click rate and a predicted conversion rate based on the user attribute characteristics and the object attribute characteristics.

In a possible embodiment, the obtaining module 901, when configured to obtain the predicted click rate and the predicted conversion rate based on the user attribute feature and the object attribute feature, is specifically configured to:

performing feature fusion on the user attribute features and the object attribute features to obtain fusion attribute features;

and obtaining the predicted click rate and the predicted conversion rate based on the fusion attribute characteristics.
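As a non-limiting illustration, attribute feature generation and feature fusion can be sketched as an embedding lookup followed by concatenation. Concatenation is only one possible fusion operation, and the embedding tables, field names and values below are hypothetical.

```python
def embed(attributes, table):
    """Look up and concatenate the embedding vector of each attribute value."""
    features = []
    for field, value in attributes.items():
        features.extend(table[(field, value)])
    return features


def fuse(user_features, object_features):
    """Assumed fusion operation: concatenate the two feature vectors."""
    return user_features + object_features


# Hypothetical embedding tables; in practice these would be learned parameters.
USER_TABLE = {("age_group", "18-24"): [0.1, 0.3], ("region", "south"): [0.7, 0.2]}
OBJECT_TABLE = {("genre", "rock"): [0.5, 0.9]}

user_feat = embed({"age_group": "18-24", "region": "south"}, USER_TABLE)
object_feat = embed({"genre": "rock"}, OBJECT_TABLE)
fused = fuse(user_feat, object_feat)
```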

In a possible embodiment, the obtaining module 901, when configured to obtain the predicted click rate and the predicted conversion rate based on the fusion attribute feature, is specifically configured to:

generating a click prediction feature and a conversion prediction feature based on the fusion attribute feature;

and obtaining a predicted click rate based on the click prediction characteristic and obtaining a predicted conversion rate based on the conversion prediction characteristic.

In a possible embodiment, the initial prediction model includes a click weight prediction network, a conversion weight prediction network, and N prediction feature generation networks, where N is a positive integer;

the obtaining module 901, when configured to generate the click prediction feature and the conversion prediction feature based on the fusion attribute feature, is specifically configured to:

inputting the fusion attribute features into N prediction feature generation networks, and respectively generating corresponding initial prediction features in each prediction feature generation network according to the fusion attribute features;

inputting the fusion attribute characteristics into a click weight prediction network, and generating a first prediction weight corresponding to each prediction characteristic generation network in the click weight prediction network;

inputting the fusion attribute characteristics into a conversion weight prediction network, and generating a second prediction weight corresponding to each prediction characteristic generation network in the conversion weight prediction network;

respectively carrying out weighted summation on the initial prediction features corresponding to each prediction feature generation network by using the first prediction weight corresponding to each prediction feature generation network to obtain click prediction features;

and respectively carrying out weighted summation on the initial prediction features corresponding to each prediction feature generation network by using the second prediction weight corresponding to each prediction feature generation network to obtain the conversion prediction features.
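As a non-limiting illustration, the weighted summation over the N prediction feature generation networks can be sketched as follows. The sketch assumes that each weight prediction network outputs one logit per generation network and that the logits are normalized with a softmax; the normalization and all names are assumptions, and the initial prediction features are given directly rather than computed by real networks.

```python
import math


def softmax(logits):
    """Normalize logits into weights that sum to 1 (assumed normalization)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix(initial_features, weights):
    """Weighted sum of the initial prediction features, one weight per network."""
    dim = len(initial_features[0])
    return [
        sum(w * feat[d] for w, feat in zip(weights, initial_features))
        for d in range(dim)
    ]


# Initial prediction features from N = 2 hypothetical generation networks.
initial_features = [[1.0, 0.0], [0.0, 1.0]]

click_weights = softmax([0.0, 0.0])   # first prediction weights
conv_weights = softmax([2.0, 0.0])    # second prediction weights

click_feature = mix(initial_features, click_weights)
conv_feature = mix(initial_features, conv_weights)
```

Because the two weight prediction networks produce different weights over the same initial prediction features, the click task and the conversion task can share the generation networks while still receiving task-specific features.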

In the embodiment of the application, the obtaining module inputs a sample feature set into an initial prediction model, and obtains a predicted click rate and a predicted conversion rate of a sample user for a sample recommendation object based on the initial prediction model; the generating module generates a click loss function based on the click label and the predicted click rate, and generates a click conversion loss function based on the click label, the conversion label, the predicted click rate and the predicted conversion rate; the generating module further obtains a weight loss function according to the click loss function, the click conversion loss function and the model parameters of the initial prediction model, and generates a click weight corresponding to the click loss function and a click conversion weight corresponding to the click conversion loss function based on the weight loss function; the correcting module obtains a target loss function based on the click weight, the click conversion weight, the click loss function and the click conversion loss function, and corrects the model parameters of the initial prediction model based on the target loss function to obtain a target prediction model. By implementing the device, the click weight and the click conversion weight can be dynamically adjusted during model training so as to keep the two loss functions in the target loss function balanced, so that the target prediction model is trained more effectively and predicts the click rate and the conversion rate more accurately.
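As a non-limiting illustration, the two loss functions generated by the generating module can be sketched with binary cross-entropy. The sketch assumes that the click conversion loss compares the product of the click label and the conversion label with the product of the predicted click rate and the predicted conversion rate; this product form and the function names are assumptions and are not stated by the embodiment.

```python
import math


def bce(label, prob, eps=1e-7):
    """Binary cross-entropy for one sample, with clipping for stability."""
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def click_loss(click_labels, pctr):
    """Mean cross-entropy between click labels and predicted click rates."""
    return sum(bce(y, p) for y, p in zip(click_labels, pctr)) / len(click_labels)


def click_conversion_loss(click_labels, conv_labels, pctr, pcvr):
    """Mean cross-entropy on the joint click-and-convert event (assumed form)."""
    total = 0.0
    for y_click, y_conv, p_click, p_conv in zip(click_labels, conv_labels, pctr, pcvr):
        total += bce(y_click * y_conv, p_click * p_conv)
    return total / len(click_labels)
```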

Each functional module in the embodiments of the present application may be integrated into one module, or each module may exist alone physically, or two or more modules are integrated into one module. The integrated module may be implemented in a form of hardware, or may be implemented in a form of software functional module, which is not limited in this application.

Referring to fig. 10, fig. 10 is a schematic structural diagram of a data recommendation device based on a recommendation model according to the present application. It should be noted that the data recommendation device based on the recommendation model shown in fig. 10 is used for executing the method of the embodiment shown in fig. 7 of the present application. For convenience of description, only the portion related to the embodiment of the present application is shown; for specific technical details that are not disclosed here, please refer to the embodiment shown in fig. 7 of the present application. The recommendation model-based data recommendation device 1000 may include: an obtaining module 1001, an input module 1002, a generating module 1003 and a pushing module 1004. Wherein:

an obtaining module 1001, configured to obtain target user attribute information of a predicted user and target object attribute information of an object to be recommended;

an input module 1002, configured to input the target user attribute information and the target object attribute information into the target prediction model;

the generating module 1003 is configured to generate a target prediction click rate and a target prediction conversion rate of the prediction user for the object to be recommended in the target prediction model;

the obtaining module 1001 is further configured to obtain an interest score of the prediction user for the object to be recommended based on the target prediction click rate and the target prediction conversion rate;

the pushing module 1004 is configured to, if the interest score is greater than the interest score threshold, push the object to be recommended to the user terminal corresponding to the predicted user.
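As a non-limiting illustration, the scoring and pushing steps can be sketched as follows. The sketch assumes that the interest score is the product of the target prediction click rate and the target prediction conversion rate; the embodiment only states that the score is based on both rates, and the object identifiers and threshold below are hypothetical.

```python
def interest_score(pctr, pcvr):
    """Assumed interest score: product of predicted click and conversion rates."""
    return pctr * pcvr


def objects_to_push(candidates, threshold):
    """Return ids of candidate objects whose interest score exceeds the threshold.

    candidates: iterable of (object_id, predicted_click_rate, predicted_conversion_rate)
    """
    return [
        object_id
        for object_id, pctr, pcvr in candidates
        if interest_score(pctr, pcvr) > threshold
    ]


# Hypothetical predictions for two objects to be recommended.
candidates = [("song_a", 0.5, 0.5), ("song_b", 0.9, 0.1)]
pushed = objects_to_push(candidates, threshold=0.2)
```

Here only `song_a` is pushed: its score of 0.25 exceeds the threshold, whereas `song_b` has a high predicted click rate but a low predicted conversion rate.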

In one possible implementation, the target prediction model may be trained using the related descriptions in the embodiment shown in fig. 2 and/or the embodiment shown in fig. 3.

In the embodiment of the application, the obtaining module obtains target user attribute information of a prediction user and target object attribute information of an object to be recommended; the input module inputs the target user attribute information and the target object attribute information into a target prediction model; the generating module generates a target prediction click rate and a target prediction conversion rate of the prediction user for the object to be recommended in the target prediction model; the obtaining module obtains an interest score of the prediction user for the object to be recommended based on the target prediction click rate and the target prediction conversion rate; and if the interest score is greater than the interest score threshold, the pushing module pushes the object to be recommended to the user terminal corresponding to the prediction user. By implementing the device, the interest score of the prediction user for the object to be recommended can be obtained by using the target prediction model, and the object to be recommended is pushed based on the interest score, so that accurate pushing in a recommendation scene is realized.

Referring to fig. 11, fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 11, the electronic device 1100 includes at least one processor 1101 and a memory 1102. Optionally, the electronic device may further include a network interface. Data can be exchanged among the processor 1101, the memory 1102 and the network interface. The network interface is controlled by the processor 1101 to send and receive messages; the memory 1102 is used for storing a computer program that comprises program instructions; and the processor 1101 is used for executing the program instructions stored in the memory 1102. The processor 1101 is configured to call the program instructions to perform the above method.

Memory 1102 may include volatile memory (volatile memory), such as random-access memory (RAM); the memory 1102 may also include a non-volatile memory (non-volatile memory), such as a flash memory (flash memory), a solid-state drive (SSD), etc.; memory 1102 may also comprise a combination of memories of the type described above.

The processor 1101 may be a Central Processing Unit (CPU). In one embodiment, processor 1101 may also be a Graphics Processing Unit (GPU). The processor 1101 may also be a combination of a CPU and a GPU.

In one possible embodiment, the memory 1102 is used to store program instructions that the processor 1101 may call to perform the following steps:

generating a click loss function based on the click label and the predicted click rate, and generating a click conversion loss function based on the click label, the conversion label, the predicted click rate and the predicted conversion rate;

In one possible embodiment, the processor 1101, when being configured to obtain the weight loss function according to the click loss function, the click conversion loss function and the model parameters of the initial prediction model, is specifically configured to:

In one possible embodiment, the processor 1101, when being configured to generate the weight loss function according to the click gradient function and the click conversion gradient function, is specifically configured to:

In one possible embodiment, the processor 1101, when being configured to generate the click weight corresponding to the click loss function and the click conversion weight corresponding to the click conversion loss function based on the weight loss function, is specifically configured to:

In one possible embodiment, the processor 1101, when being configured to obtain the target loss function based on the click weight, the click conversion weight, the click loss function, and the click conversion loss function, is specifically configured to:

In one possible embodiment, the processor 1101, when configured to obtain the predicted click rate and the predicted conversion rate of the sample user for the sample recommendation object based on the initial prediction model, is specifically configured to:

In one possible embodiment, the processor 1101, when being configured to obtain the predicted click rate and the predicted conversion rate based on the user attribute feature and the object attribute feature, is specifically configured to:

In one possible embodiment, the processor 1101, when being configured to obtain the predicted click rate and the predicted conversion rate based on the fused attribute feature, is specifically configured to:

the processor 1101, when being configured to generate the click prediction feature and the conversion prediction feature based on the fusion attribute feature, is specifically configured to:

The target prediction model is obtained by training using the related description in the embodiment shown in fig. 2 and/or the embodiment shown in fig. 3.

In a specific implementation, the above-described apparatus, processor 1101, memory 1102, and the like may perform the implementation described in the above method embodiment, and may also perform the implementation described in this embodiment, which is not described herein again.

Embodiments of the present application also provide a computer (readable) storage medium storing a computer program, where the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to perform some or all of the steps performed in the above-mentioned method embodiments. Optionally, the computer storage medium may be volatile or non-volatile. The computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.

As used herein, "a plurality" means two or more. "And/or" describes the association relationship of the associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware. The program can be stored in a computer storage medium, which can be a computer-readable storage medium, and when executed, the program can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

While the present disclosure has been described with reference to particular embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present disclosure.

Claims

1. A recommendation model training method, the method comprising:

inputting the sample feature set into an initial prediction model, and obtaining a predicted click rate and a predicted conversion rate of the sample user for the sample recommendation object based on the initial prediction model;

2. The method of claim 1, wherein deriving a weight loss function from the click loss function, the click conversion loss function, and model parameters of the initial prediction model comprises:

determining a click gradient function according to the first weight function, the click loss function and model parameters of the initial prediction model;

and generating the weight loss function according to the click gradient function and the click conversion gradient function.

3. The method of claim 2, wherein the generating the weight loss function from the click gradient function and the click conversion gradient function comprises:

and generating the weight loss function according to the target click gradient function and the target click conversion gradient function.

4. The method according to claim 2, wherein the generating the click weight corresponding to the click loss function and the click conversion weight corresponding to the click conversion loss function based on the weight loss function comprises:

taking the derivative of the weight loss function with respect to the first weight function to obtain a first weight adjusting function, and determining the click weight according to the first weight adjusting function and the first weight function;

and taking the derivative of the weight loss function with respect to the second weight function to obtain a second weight adjusting function, and determining the click conversion weight according to the second weight adjusting function and the second weight function.

5. The method of claim 1, wherein deriving a target loss function based on the click weight, the click conversion weight, the click loss function, and the click conversion loss function comprises:

and generating the target loss function according to the first weighted loss function and the second weighted loss function.

6. The method of claim 1, wherein obtaining the predicted click-through rate and the predicted conversion rate of the sample user for the sample recommendation object based on the initial prediction model comprises:

generating a user attribute feature corresponding to the user attribute information and generating an object attribute feature corresponding to the object attribute information in the initial prediction model;

and acquiring the predicted click rate and the predicted conversion rate based on the user attribute characteristics and the object attribute characteristics.

7. The method of claim 6, wherein obtaining the predicted click rate and the predicted conversion rate based on the user attribute features and the object attribute features comprises:

and acquiring the predicted click rate and the predicted conversion rate based on the fusion attribute characteristics.

8. The method of claim 7, wherein obtaining the predicted click rate and the predicted conversion rate based on the fused attribute features comprises:

and obtaining the predicted click rate based on the click prediction characteristic, and obtaining the predicted conversion rate based on the conversion prediction characteristic.

9. The method of claim 8, wherein the initial prediction model comprises a click weight prediction network, a conversion weight prediction network, and N predicted feature generation networks, N being a positive integer;

generating a click prediction feature and a conversion prediction feature based on the fused attribute feature includes:

inputting the fusion attribute features into the N prediction feature generation networks, and respectively generating corresponding initial prediction features in each prediction feature generation network according to the fusion attribute features;

inputting the fusion attribute characteristics into the click weight prediction network, and generating a first prediction weight corresponding to each prediction characteristic generation network in the click weight prediction network;

inputting the fusion attribute features into the conversion weight prediction network, and generating a second prediction weight corresponding to each prediction feature generation network in the conversion weight prediction network;

respectively carrying out weighted summation on the initial prediction features corresponding to each prediction feature generation network by using the first prediction weight corresponding to each prediction feature generation network to obtain the click prediction features;

10. A data recommendation method based on a recommendation model, the method comprising:

inputting the target user attribute information and the target object attribute information into a target prediction model; the target prediction model is obtained by training by adopting the method of any one of the claims 1 to 9;

generating a target prediction click rate and a target prediction conversion rate of the prediction user for the object to be recommended in the target prediction model;

obtaining the interest score of the prediction user for the object to be recommended based on the target prediction click rate and the target prediction conversion rate;

and if the interest score is larger than the interest score threshold, pushing the object to be recommended to a user terminal corresponding to the prediction user.

11. An electronic device comprising a processor and a memory, wherein the memory is configured to store a computer program comprising program instructions, and wherein the processor is configured to invoke the program instructions to perform the method of any of claims 1-10.

12. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-10.