CN115965463A - Model training method and device, computer equipment and storage medium - Google Patents

Model training method and device, computer equipment and storage medium

Info

Publication number
CN115965463A
CN115965463A
Authority
CN
China
Prior art keywords
target, behavior prediction, task, user, behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211678130.1A
Other languages
Chinese (zh)
Inventor
苏婵菲
孔涛涛
林鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Hefei Technology Co ltd
Original Assignee
Shenzhen Hefei Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Hefei Technology Co ltd filed Critical Shenzhen Hefei Technology Co ltd
Priority to CN202211678130.1A priority Critical patent/CN115965463A/en
Publication of CN115965463A publication Critical patent/CN115965463A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a model training method and apparatus, a computer device, and a storage medium, relating to the technical field of artificial intelligence. The method comprises the following steps: acquiring a training sample set; inputting the feature data of each sample user into a first initial model to obtain a prediction label corresponding to each behavior prediction task; determining a first loss value for each behavior prediction task according to the degree of difference between that task's prediction label and target label; determining a total loss value based on the first loss values of all behavior prediction tasks; and iteratively training the first initial model according to the total loss value until a first target condition is met, taking the trained first initial model as a target marketing model. In this way, multi-task learning trains the model's predictive ability across multiple behavior prediction tasks, enables information sharing among those tasks, and improves the accuracy with which the model predicts user behavior.

Description

Model training method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a model training method and apparatus, a computer device, and a storage medium.
Background
In many marketing scenarios, the marketing budget is limited, so top high-value users generally need to be screened from the full user base for marketing; accurately selecting these high-value users is therefore an important challenge when modeling marketing scenarios.
Disclosure of Invention
The application provides a model training method and apparatus, a computer device, and a storage medium, so as to improve the prediction accuracy of a model.
In a first aspect, an embodiment of the present application provides a model training method. The method includes: acquiring a training sample set, where each training sample in the set carries a plurality of target labels, the target labels correspond one-to-one with a plurality of behavior prediction tasks, each training sample includes the sample user feature data of a sample user, each target label indicates whether the sample user has completed the target behavior corresponding to its behavior prediction task, and the sample user feature data is extracted from a target credit platform; inputting the feature data of each sample user into a first initial model to obtain a prediction label for each behavior prediction task, where the prediction label represents the probability that the sample user completes the target behavior corresponding to that behavior prediction task; determining a first loss value for each behavior prediction task according to the degree of difference between that task's prediction label and target label; determining a total loss value based on the first loss values of all behavior prediction tasks; and iteratively training the first initial model according to the total loss value until a first target condition is met, taking the trained first initial model as a target marketing model.
In a second aspect, an embodiment of the present application provides a model training apparatus, including a training sample acquisition module, a label prediction module, a first loss value determination module, a total loss value determination module, and a model training module. The training sample acquisition module is configured to acquire a training sample set, where each training sample carries multiple target labels in one-to-one correspondence with multiple behavior prediction tasks, each training sample includes the sample user feature data of a sample user, and each target label indicates whether the sample user has completed the target behavior corresponding to its behavior prediction task. The label prediction module is configured to input the feature data of each sample user into a first initial model to obtain a prediction label for each behavior prediction task, the prediction label representing the probability that the sample user completes the target behavior corresponding to that task. The first loss value determination module is configured to determine a first loss value for each behavior prediction task according to the degree of difference between that task's prediction label and target label. The total loss value determination module is configured to determine a total loss value based on the first loss values of all behavior prediction tasks. The model training module is configured to iteratively train the first initial model according to the total loss value until a first target condition is met, taking the trained first initial model as a target marketing model.
In a third aspect, an embodiment of the present application provides a computer device, including: one or more processors; a memory; one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the methods described above.
In a fourth aspect, the present application provides a computer-readable storage medium in which program code is stored; the program code can be invoked by a processor to execute the method described above.
According to the above scheme, a training sample set is acquired, where each training sample carries a plurality of target labels in one-to-one correspondence with a plurality of behavior prediction tasks, each training sample includes the sample user feature data of a sample user, each target label indicates whether the sample user has completed the target behavior corresponding to its behavior prediction task, and the sample user feature data is extracted from a target credit platform. The feature data of each sample user is input into the first initial model to obtain a prediction label for each behavior prediction task, the prediction label representing the probability that the sample user completes the target behavior corresponding to that task. A first loss value is determined for each behavior prediction task according to the degree of difference between that task's prediction label and target label; a total loss value is determined based on the first loss values of all behavior prediction tasks; and the first initial model is iteratively trained according to the total loss value until a first target condition is met, with the trained first initial model serving as the target marketing model.
In this way, a multi-task learning framework trains the model's predictive ability across multiple behavior prediction tasks: a total loss value is determined from the loss of each task's prediction, and the model is iteratively trained on that total loss. This lets the behavior prediction tasks share and complement one another's information, improves the prediction performance of every task, and comprehensively improves the model's ability to predict user behavior. Users screened with the target marketing model are therefore of higher quality: high-quality users of the target credit platform can be accurately selected from the user population for credit-product pushes, avoiding wasted marketing budget while improving the effect of product pushes.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 shows a schematic diagram of a credit scenario funnel-shaped conversion link provided by an embodiment of the application.
Fig. 2 is a schematic flowchart illustrating a model training method according to an embodiment of the present application.
Fig. 3 shows a schematic flow diagram of the substeps of step S110 in fig. 2 in one embodiment.
Fig. 4 shows a schematic flow chart of the sub-steps of step S120 in fig. 2 in one embodiment.
Fig. 5 shows a model architecture diagram of a first initial model provided by an embodiment of the present application.
Fig. 6 shows a network structure diagram of a behavior prediction module according to an embodiment of the present application.
Fig. 7 is a flowchart illustrating a model training method according to another embodiment of the present application.
Fig. 8 is a flowchart illustrating a model training method according to another embodiment of the present application.
Fig. 9 shows a flow diagram of the sub-steps of step S370 in fig. 8 in one embodiment.
Fig. 10 shows a block diagram of a model training apparatus according to an embodiment of the present application.
Fig. 11 shows a block diagram of a computer device for performing a model training method according to an embodiment of the present application.
Fig. 12 shows a storage unit for storing or carrying program code that implements a model training method according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It should be apparent that the described embodiments are only a few embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that some of the flows described in the specification, claims, and drawings of the present application include operations that appear in a particular order, but these operations may be performed out of that order or in parallel. Sequence numbers such as S110 and S120 merely distinguish the operations and do not by themselves imply any execution order. The flows may also include more or fewer operations, performed sequentially or in parallel. The terms "first", "second", and the like in the description, claims, and drawings distinguish similar elements and do not necessarily describe a particular sequence or chronological order; data so labeled are interchangeable where appropriate, so that the embodiments described herein can operate in sequences other than those illustrated or described. Furthermore, the terms "comprise", "include", and "have", and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or sub-modules is not necessarily limited to those expressly listed, but may include other steps or sub-modules not expressly listed or inherent to such a process, method, article, or apparatus.
In the related art, as shown in fig. 1, credit loans in financial marketing scenarios typically convert users through a funnel of exposure, click, real-name registration, credit application, and loan application. The number of users who reach the final loan link is usually small, so modeling directly on the credit application link yields low model accuracy. As a result, users screened from the full user base by such a model are of low value, which harms the marketing push effect of credit products and wastes marketing cost.
The inventor provides a model training method and apparatus, a computer device, and a storage medium that perform model training via multi-task learning, thereby improving the model's end-to-end conversion prediction accuracy. The model training method provided in the embodiments of the present application is described in detail below.
Referring to fig. 2, fig. 2 is a schematic flow chart of a model training method according to an embodiment of the present application. The model training method provided by the embodiment of the present application will be described in detail below with reference to fig. 2. The model training method may include the steps of:
step S110: the method comprises the steps of obtaining a training sample set, wherein each training sample in the training sample set carries a plurality of target labels, the target labels correspond to behavior prediction tasks one by one, each training sample comprises sample user characteristic data of a sample user, the target labels are used for representing whether the sample user completes target behaviors corresponding to the behavior prediction tasks, and the sample user characteristic data are extracted from a target credit platform.
In this embodiment, the training sample set may be obtained from an open-source feature data sample set, or may be constructed automatically from a large number of crawled sample data sets; this embodiment does not limit the source. Each training sample comprises the sample user feature data of one sample user, and the sample user feature data at least comprises user attribute feature data and user historical behavior feature data. Each training sample carries a plurality of target labels, each target label corresponds to one behavior prediction task, and each target label indicates whether the sample user completed the target behavior corresponding to that task. The target label may be a numerical value: for example, the value 1 labels completion of the target behavior corresponding to the behavior prediction task, and the value 0 labels non-completion. Of course, other numerical values may also be used as target labels, which this embodiment does not limit.
Optionally, the behavior prediction tasks may include multiple tasks such as "send success -> click", "send success -> real name", "send success -> credit", and "send success -> borrow". "Send success -> click" may be understood as the behavior that, after push information about a credit product is successfully sent to a sample user, the sample user clicks the product link contained in the push information; "send success -> real name" as the behavior that the sample user enters the loan APP and completes real-name registration after clicking the product link; "send success -> credit" as the behavior that credit is successfully granted after the sample user completes real-name registration; and "send success -> borrow" as the behavior that the sample user completes a borrowing after credit is granted.
In some embodiments, referring to fig. 3, step S110 may include the following steps S111 to S114:
step S111: the method comprises the steps of obtaining a sample data set from a target credit platform, wherein the sample data set comprises user attribute data and historical behavior data of each sample user in a sample user group.
In this embodiment, the sample data set includes sample data for each sample user, where each sample user's sample data includes that user's attribute data and historical behavior data. The user attribute data may include gender information, age information, education information, salary income information, house loan information, car loan information, family member information, and the like. It can be understood that, for a registered sample user, the user attribute data can be obtained from the registration information; for an unregistered sample user, the user attribute data can be obtained through model estimation. The historical behavior data may include behavior data related to the user's loan behaviors on the credit platform over a preset historical period, for example, the user's usage of the loan application (APP), real-name registration, credit application, credit approval, and borrowing application. The target credit platform may be any of various loan APPs, a loan website, or another financial APP or financial website, which this embodiment does not limit.
Step S112: and extracting the user attribute data of each sample user and the characteristic data of the historical behavior data to obtain the sample user characteristic data corresponding to each sample user.
Further, after the user attribute data and the historical behavior data corresponding to each sample user are obtained, feature extraction can be performed on the user attribute information and the historical behavior data of each sample user through a preset feature extraction algorithm or a pre-trained feature extraction model, and then the sample user feature data corresponding to each sample user is obtained.
Step S113: and determining whether each sample user completes the target behavior corresponding to each behavior prediction task within a target time length according to the historical behavior data of each sample user, so as to obtain a plurality of target labels corresponding to each sample user.
Here, the target label indicates whether the sample user completed the target behavior corresponding to the behavior prediction task within the target duration, where the target duration is a preset value, for example 2 days or 3 days, which this embodiment does not limit. Therefore, the historical behavior data of each sample user can be analyzed to determine whether the user completed the target behavior of each behavior prediction task within the target duration, yielding the plurality of target labels for each sample user. That is to say, if analysis shows that a sample user completed the target behavior of a behavior prediction task within the target duration, that user's sample user feature data is a positive sample for the task; if the user did not complete the target behavior within the target duration, the feature data is a negative sample for the task.
It can be understood that the target durations of different behavior prediction tasks may be the same or different; for example, the target duration of "send success -> click" may be 1 day, that of "send success -> real name" 2 days, that of "send success -> credit" 4 days, and that of "send success -> borrow" 7 days.
Step S114: and adding a plurality of target labels corresponding to each sample user to the sample user characteristic data corresponding to each sample user to obtain the training sample set.
Finally, after obtaining a plurality of target labels corresponding to each sample user, a plurality of target labels corresponding to each sample user may be added to the sample user feature data corresponding to each sample user, so as to obtain a training sample set.
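Steps S111 to S114 can be sketched as follows; the task names, the day-based timestamps, and the field layout are illustrative assumptions, not the patent's actual data format:

```python
# Hypothetical sketch of steps S111-S114: build multi-task training samples.
# Task names and per-task target durations follow the examples in the text.

TASKS = ["click", "real_name", "credit", "borrow"]  # four chained behaviors
TARGET_DURATION_DAYS = {"click": 1, "real_name": 2, "credit": 4, "borrow": 7}

def make_labels(behavior_days):
    """behavior_days maps a behavior name to the day (after the push) on which
    it was completed, or is missing if never completed. Returns one 0/1 label
    per task: 1 if completed within that task's target duration (step S113)."""
    labels = {}
    for task in TASKS:
        day = behavior_days.get(task)
        labels[task] = 1 if day is not None and day <= TARGET_DURATION_DAYS[task] else 0
    return labels

def make_sample(features, behavior_days):
    """Attach the multi-task labels to one user's feature data (step S114)."""
    return {"features": features, "labels": make_labels(behavior_days)}

# Example: clicked on day 0, real-named on day 1, credit granted too late (day 5).
sample = make_sample([0.3, 1.0, 0.0],
                     {"click": 0, "real_name": 1, "credit": 5})
```

A user is thus a positive sample for some tasks and a negative sample for others, which is exactly what the multi-label training set encodes.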
Step S120: inputting the characteristic data of each sample user into a first initial model to obtain a prediction label corresponding to each behavior prediction task, wherein the prediction label is used for representing the probability value of the target behavior corresponding to the behavior prediction task completed by the sample user.
In some embodiments, the first initial model includes a plurality of target task prediction modules, and each behavior prediction task is associated with at least one target task prediction module. Referring to fig. 4, step S120 may include the following steps S121 to S122:
step S121: and respectively inputting each sample user characteristic data into each target task prediction module to obtain a first probability value output by each target task prediction module.
For convenience of description, the first initial model is described with reference to the model structure of fig. 5. The first initial model may include a feature-sharing embedding module, a plurality of target task prediction modules, and a loss function for each target task prediction module, where the target task prediction modules may include a "send success -> click" task module, a "click -> real name" task module, a "real name -> credit" task module, and a "credit -> borrow" task module.
Based on this, the feature data of each sample user can be input into the feature-sharing embedding module, which uses a field-embedding method; the embedding features it outputs can be expressed by the following formula:
e_fe = [x_1 · e_1, x_2 · e_2, ..., x_N · e_N]
where e_i denotes the embedding of the i-th feature, x_i denotes the value of the i-th feature, and e_fe, the concatenation of all scaled feature embeddings, is the output of the feature-sharing embedding module.
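As a toy illustration of the formula above (the embedding vectors here are made-up constants, whereas a real model learns them as parameters):

```python
# Toy sketch of the feature-sharing embedding: e_fe = [x1*e1, x2*e2, ..., xN*eN].
# Embedding vectors below are illustrative constants, not trained parameters.

def field_embedding(x, embeddings):
    """x: list of N feature values; embeddings: list of N embedding vectors.
    Scales each embedding e_i by the feature value x_i and concatenates the
    results, producing e_fe as in the formula above."""
    e_fe = []
    for x_i, e_i in zip(x, embeddings):
        e_fe.extend(x_i * v for v in e_i)
    return e_fe

x = [2.0, 0.5]                      # two feature values
emb = [[1.0, -1.0], [4.0, 2.0]]     # one 2-dim embedding per feature
e_fe = field_embedding(x, emb)      # [2.0, -2.0, 2.0, 1.0]
```

The concatenated vector e_fe is what the downstream task prediction modules all share as input.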
Further, the embedding features output by the feature-sharing embedding module are input into each target task prediction module, yielding a first probability value from each module. As shown in fig. 5, y1 is the first probability value output by the "send success -> click" task module and can be understood as the predicted click probability; y2 is the output of the "click -> real name" task module, i.e. the transition probability from click to real name; y3 is the output of the "real name -> credit" task module, i.e. the transition probability from real name to credit; and y4 is the output of the "credit -> borrow" task module, i.e. the transition probability from credit to borrowing.
In some embodiments, the network architecture of each target task prediction module may be as shown in fig. 6: each module may include 4 fully connected (FC) layers, 3 ReLU activation layers, and 1 Sigmoid activation layer. The first fully connected layer receives the feature embedding as input, with dimension 16 × N, where 16 is the dimension of each feature embedding and N is the number of features; the subsequent fully connected layers progressively reduce the dimension. Each hidden fully connected layer is followed by a ReLU activation layer, and the last fully connected layer is followed by a Sigmoid activation layer that outputs a value between 0 and 1, namely the aforementioned first probability value.
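A minimal pure-Python sketch of one such prediction tower follows; the layer count, widths, and weights below are arbitrary illustrations, since the patent fixes only the FC/ReLU/Sigmoid pattern and the 16 × N input dimension:

```python
import math

def relu(v):
    """ReLU activation applied elementwise to a vector."""
    return [max(0.0, a) for a in v]

def sigmoid(a):
    """Sigmoid activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def fc(v, weights, bias):
    """One fully connected layer: out_j = sum_i v_i * weights[j][i] + bias[j]."""
    return [sum(vi * wj for vi, wj in zip(v, row)) + b
            for row, b in zip(weights, bias)]

def tower_forward(e_fe, layers):
    """layers: list of (weights, bias) pairs. ReLU follows every hidden FC
    layer; the last FC layer is followed by Sigmoid, yielding the first
    probability value in (0, 1)."""
    v = e_fe
    for weights, bias in layers[:-1]:
        v = relu(fc(v, weights, bias))
    w_last, b_last = layers[-1]
    return sigmoid(fc(v, w_last, b_last)[0])

# Tiny two-layer example with hand-picked weights (illustrative only).
layers = [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),  # hidden FC (identity)
          ([[1.0, 1.0]], [0.0])]                   # output FC -> scalar
p = tower_forward([1.0, -2.0], layers)
```

A real module would use 4 FC layers and learned weights; the forward-pass structure is the same.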
Step S122: and acquiring the product of first probability values output by the target task prediction module associated with each behavior prediction task, and using the product as a prediction label corresponding to each behavior prediction task.
In this embodiment, if the number of target task prediction modules associated with a behavior prediction task is 1, the first probability value output by that module is used as the prediction label of the task. In a credit scenario, the link from successful sending to borrowing is a strict conversion chain: a user who borrows must necessarily have been granted credit, completed real-name registration, and clicked. Hence the probability from exposure to a given link equals the probability from exposure to the previous link multiplied by the transition probability from the previous link to the current one. Therefore, if multiple target task prediction modules are associated with a behavior prediction task, the product of their first probability values is used as the prediction label of the task.
Taking fig. 5 as an example, only 1 target task prediction module is associated with "send success -> click", namely the "send success -> click" task module; its first probability value y1 is taken as the prediction label p1, i.e. p1 = y1.
Optionally, 2 target task prediction modules are associated with "send success -> real name", namely the "send success -> click" and "click -> real name" task modules; the product of the first probability value y1 output by the "send success -> click" task module and the first probability value y2 output by the "click -> real name" task module is taken as the prediction label p2, i.e. p2 = y1 · y2.
Optionally, 3 target task prediction modules are associated with "send success -> credit", namely the "send success -> click", "click -> real name", and "real name -> credit" task modules; the product of the first probability values y1, y2, and y3 is taken as the prediction label p3, i.e. p3 = y1 · y2 · y3.
Optionally, 4 target task prediction modules are associated with "send success -> borrow", namely the "send success -> click", "click -> real name", "real name -> credit", and "credit -> borrow" task modules; the product of the first probability values y1, y2, y3, and y4 is taken as the prediction label p4, i.e. p4 = y1 · y2 · y3 · y4.
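The prediction labels p1 to p4 derived above are simply cumulative products of the per-link probabilities, which can be sketched as:

```python
def prediction_labels(step_probs):
    """step_probs = [y1, y2, y3, y4]: the transition probability of each link.
    Returns [p1, p2, p3, p4] where p_k = y1 * ... * y_k, matching
    p1 = y1, p2 = y1*y2, p3 = y1*y2*y3, p4 = y1*y2*y3*y4."""
    labels, running = [], 1.0
    for y in step_probs:
        running *= y
        labels.append(running)
    return labels

# Illustrative per-link probabilities for the four chained modules.
p = prediction_labels([0.5, 0.4, 0.5, 0.2])
```

The strict-funnel assumption (every borrower has clicked, real-named, and been granted credit) is what justifies multiplying the link probabilities.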
Step S130: and determining a first loss value corresponding to each behavior prediction task according to the difference degree between the prediction label corresponding to each behavior prediction task and the target label corresponding to each behavior prediction task.
Step S140: and determining a total loss value based on the first loss value corresponding to each behavior prediction task.
Further, after the prediction label of each behavior prediction task is obtained for each sample user, the first loss value of each behavior prediction task may be determined according to the degree of difference between that task's prediction label and target label; that is, the loss of each behavior prediction task is obtained.
In some embodiments, the first loss values of the behavior prediction tasks are summed, optionally with weights, to obtain the total loss value. The sum of the first loss values may be used directly as the total loss value; taking the loss values in fig. 5 as an example, total loss value = l1 + l2 + l3 + l4.
Of course, the total loss value may also be obtained by presetting weighting coefficients k1, k2, k3, and k4 for the behavior prediction tasks and computing the weighted sum of their first loss values. Taking the loss values in fig. 5 as an example, total loss value = k1 · l1 + k2 · l2 + k3 · l3 + k4 · l4.
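The patent does not name the per-task loss function; assuming binary cross-entropy (a common choice for 0/1 labels), the weighted total loss can be sketched as:

```python
import math

def bce(p, t, eps=1e-12):
    """Binary cross-entropy between predicted probability p and 0/1 target t.
    p is clipped away from 0 and 1 to keep the logs finite."""
    p = min(max(p, eps), 1.0 - eps)
    return -(t * math.log(p) + (1 - t) * math.log(1.0 - p))

def total_loss(pred_labels, target_labels, weights=None):
    """Weighted sum of per-task losses: L = k1*l1 + k2*l2 + k3*l3 + k4*l4.
    With weights omitted, every k_i = 1, i.e. the plain sum l1+l2+l3+l4."""
    losses = [bce(p, t) for p, t in zip(pred_labels, target_labels)]
    if weights is None:
        weights = [1.0] * len(losses)
    return sum(k * l for k, l in zip(weights, losses))

# Four tasks, each predicted at 0.5: every per-task loss is -ln(0.5) = ln 2.
loss = total_loss([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

Any differentiable per-task loss would fit the same weighted-sum structure.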
Step S150: and performing iterative training on the first initial model according to the total loss value until a first target condition is met, and obtaining the trained first initial model as a target marketing model.
Finally, after the total loss value computed from the first loss values of the behavior prediction tasks is obtained, a back-propagation pass can be executed on the network of the first initial model to update its parameters according to the total loss value. An Adam optimization function may be used, with a batch size of 1024 and an initial learning rate of 0.02, the learning rate being halved every two epochs. This avoids problems such as the model converging so fast that it overshoots or skips the minimum, or the loss oscillating without ever converging to the minimum; in other words, it ensures that the first initial model converges in good time. The parameters of the first initial model are then iteratively updated based on the total loss value until the first target condition is met, yielding the target marketing model, whose structure and parameters are saved.
The first target condition may be that the total loss value is smaller than a preset value, that the total loss value no longer changes, that the number of training iterations reaches a preset number, and so on. It can be understood that as the first initial model undergoes iterative training over multiple training cycles on the first user feature data set, where each training cycle includes multiple training iterations, the parameters in the first initial model are continuously optimized, so that the total loss value keeps decreasing until it settles at a fixed value or drops below the preset value, which indicates that the first initial model has converged. Alternatively, the first initial model may be deemed converged once the number of training iterations reaches the preset number. At that point, the converged first initial model can be used as the target marketing model. The preset value and the preset number are set in advance, and their values can be adjusted for different application scenarios, which is not limited in this embodiment.
It can be understood that after the training of the target marketing model is completed, its probability prediction capability can be used to predict, for each user in a user group, the probability value of completing the target behavior corresponding to each behavior prediction task, and the users whose probability values are greater than a preset probability value can then be screened out of the user group as the user group to be pushed. The users screened out through the target marketing model are therefore of higher quality; under a limited marketing budget, sending the product push information of a credit product to the users in this screened group can improve the push success rate of the credit product as well as the users' credit granting rate.
In this embodiment, a plurality of target task prediction modules are trained end to end under a multi-task learning architecture, and a total loss value is calculated from the loss value of each target task prediction module; the first initial model is then iteratively trained based on the total loss value until the model converges, yielding the target marketing model. By adopting the multi-task learning framework, each target task prediction module can make full use of the information from the user's pre-loan links (including click, real name, and credit application), which improves the prediction accuracy of each target task prediction module in the model. Predicting a user's loan probability with the target marketing model therefore makes it possible to accurately select high-quality users from a user group for pushing credit products, avoiding the waste of marketing budget while improving the product push effect and the rate of return.
Referring to fig. 7, fig. 7 is a schematic flowchart illustrating a model training method according to another embodiment of the present application. The model training method provided by the embodiment of the present application will be described in detail below with reference to fig. 7. The model training method may include the steps of:
step S210: the method comprises the steps of obtaining a training sample set, wherein each training sample in the training sample set carries a plurality of target labels, the target labels correspond to behavior prediction tasks one by one, each training sample comprises sample user characteristic data of a sample user, the target labels are used for representing whether the sample user completes target behaviors corresponding to the behavior prediction tasks, and the sample user characteristic data are extracted from a target credit platform.
Step S220: inputting the characteristic data of each sample user into a first initial model to obtain a prediction label corresponding to each behavior prediction task, wherein the prediction label is used for representing the probability value of the target behavior corresponding to the behavior prediction task completed by the sample user.
In this embodiment, for the detailed implementation of step S210 to step S220, reference can be made to the foregoing embodiments.
Step S230: and acquiring the difference degree between the prediction label corresponding to each behavior prediction task and the target label corresponding to each behavior prediction task according to a cross entropy loss function, and taking the difference degree as a second loss value corresponding to each behavior prediction task.
Wherein the cross entropy loss function can be expressed as the following formula:
L_CE,j = -(1/N) * Σ_{i=1}^{N} [ Y_ij * log(p_ij) + (1 - Y_ij) * log(1 - p_ij) ]
where L_CE,j is the cross entropy corresponding to the jth behavior prediction task, i.e., the second loss value; p_ij is the predicted label of training sample i for the jth behavior prediction task; Y_ij is the target label of training sample i for the jth behavior prediction task; and N is the number of samples.
Therefore, the degree of difference between the prediction label and the target label of each behavior prediction task can be measured by the cross entropy loss function, and the second loss value corresponding to each behavior prediction task calculated accordingly. Moreover, calculating the loss value with the cross entropy loss function avoids problems such as vanishing gradients and a declining learning rate during gradient descent.
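As a concrete sketch, the binary cross entropy for one behavior prediction task can be computed as follows (a plain-Python illustration; the clipping constant `eps` is our assumption to keep log(0) finite):

```python
import math

def cross_entropy_loss(targets, preds, eps=1e-12):
    """Second loss value for one task: binary cross entropy averaged over N samples,
    -(1/N) * sum_i [ Y_ij*log(p_ij) + (1 - Y_ij)*log(1 - p_ij) ]."""
    n = len(targets)
    total = 0.0
    for y, p in zip(targets, preds):
        p = min(max(p, eps), 1.0 - eps)  # clip predictions away from 0 and 1
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / n
```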
Step S240: and acquiring the difference degree between the prediction label corresponding to each behavior prediction task and the target label corresponding to each behavior prediction task according to a symmetrical cross entropy loss function, and taking the difference degree as a third loss value corresponding to each behavior prediction task.
Wherein the symmetric cross entropy function can be expressed as the following formula:
L_noise-robust,j = -(1/N) * Σ_{i=1}^{N} [ p_ij * log(Y_ij) + (1 - p_ij) * log(1 - Y_ij) ]
where L_noise-robust,j is the symmetric (reverse) cross entropy corresponding to the jth behavior prediction task, i.e., the third loss value, which may also be referred to as the noise-robust loss value; p_ij is the predicted label of training sample i for the jth behavior prediction task; Y_ij is the target label of training sample i for the jth behavior prediction task; and N is the number of samples.
It can be understood that, since each training sample in the acquired training sample set carries a plurality of target labels, mislabeling may occur. For example, most services only count conversions within a short time window: only users who complete the target behavior corresponding to "send success -> click" within 2 days are labeled as positive samples, while users who complete that target behavior after 2 days are labeled as negative samples, which makes the labels of some samples inaccurate.
In this embodiment, when the cross entropy loss function is used to calculate the second loss value for each behavior prediction task, it causes the first initial model to assign a higher learning weight to training samples with smaller predicted probability values, which accelerates model fitting. However, noise labels also tend to have small predicted probabilities, so the cross entropy loss drives the model to fit the noise labels, causing the model to overfit the noise and ultimately harming the prediction accuracy of the trained first initial model. By additionally calculating the third loss value with the symmetric cross entropy loss function, the cross entropy is symmetrically enhanced while the first initial model's learning of training samples with small predicted probability values is suppressed, which suppresses noise overfitting and improves the prediction accuracy of the trained first initial model.
In other possible embodiments, the third loss value may be calculated with other fault-tolerant loss functions such as Mean Absolute Error (MAE), or label revision mechanisms such as Iterative Cross Learning (ICL) and Joint Optimization may be used to suppress the first initial model's learning of training samples with small predicted probabilities, thereby suppressing model noise overfitting.
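A minimal sketch of the noise-robust (reverse) cross entropy term, in which the roles of label and prediction are swapped relative to standard cross entropy. Truncating log(0) to the finite constant -4 is our assumption, borrowed from the symmetric cross entropy literature rather than stated in this document:

```python
import math

def noise_robust_loss(targets, preds, log_zero=-4.0):
    """Third loss value for one task: reverse cross entropy averaged over N samples,
    -(1/N) * sum_i [ p_ij*log(Y_ij) + (1 - p_ij)*log(1 - Y_ij) ],
    with log(0) truncated to `log_zero` for binary labels."""
    n = len(targets)
    total = 0.0
    for y, p in zip(targets, preds):
        log_y = math.log(y) if y > 0 else log_zero
        log_not_y = math.log(1 - y) if y < 1 else log_zero
        total += p * log_y + (1 - p) * log_not_y
    return -total / n
```

Because the truncated logarithm bounds the per-sample loss, samples with small predicted probability (which may be mislabeled) contribute far less aggressively to the gradient than under plain cross entropy.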
Step S250: and carrying out weighted summation on the second loss value corresponding to each behavior prediction task and the third loss value corresponding to each behavior prediction task to obtain the first loss value corresponding to each behavior prediction task.
Further, the second loss value corresponding to each behavior prediction task and the third loss value corresponding to each behavior prediction task may be subjected to weighted summation according to a second preset weight corresponding to the second loss value and a third preset weight corresponding to the third loss value, so as to obtain a first loss value corresponding to each behavior prediction task.
Specifically, the first loss value corresponding to each behavior prediction task may be calculated by the following formula:
L_j = a_j * L_CE,j + b_j * L_noise-robust,j
where L_j is the first loss value corresponding to the jth behavior prediction task, a_j is the second preset weight of the second loss value corresponding to the jth behavior prediction task, and b_j is the third preset weight of the third loss value corresponding to the jth behavior prediction task.
It can be understood that the second preset weights for different behavior prediction tasks may be the same or different, and likewise for the third preset weights; the weight values can be adjusted according to the actual application requirements, which is not limited in this embodiment. Introducing the second and third preset weights thus balances the problems of model noise overfitting and underfitting, and prevents the symmetric cross entropy loss function from over-suppressing the learning of training samples with low prediction probability.
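The per-task combination of the two loss terms can be sketched as follows; the weight values (a_j = 1.0, b_j = 0.1 by default) are illustrative, not values specified in this document:

```python
def first_loss(ce_loss, nr_loss, a=1.0, b=0.1):
    """First loss value for one behavior prediction task: weighted sum of the
    cross entropy (second) loss and the noise-robust (third) loss."""
    return a * ce_loss + b * nr_loss

# Per-task first loss values for four tasks, each with its own weights
ce = [0.20, 0.40, 0.35, 0.50]
nr = [0.50, 0.50, 0.40, 0.60]
a_w = [1.0, 1.0, 1.0, 1.0]
b_w = [0.1, 0.2, 0.1, 0.2]
first = [first_loss(c, r, a, b) for c, r, a, b in zip(ce, nr, a_w, b_w)]
```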
Step S260: and determining a total loss value based on the first loss value corresponding to each behavior prediction task.
Step S270: and performing iterative training on the first initial model according to the total loss value until a first target condition is met, and obtaining the trained first initial model as a target marketing model.
In this embodiment, for the detailed implementation of step S260 to step S270, reference can be made to the foregoing embodiments.
In this embodiment, a multi-task learning framework is adopted, and the accuracy on the final conversion link is improved by making full use of the samples from the preceding links. Meanwhile, combining a noise-robust loss function mitigates the influence of inaccurate sample labels and hard-to-classify samples, thereby improving the accuracy with which the trained target marketing model predicts a user's loan intention.
Referring to fig. 8, fig. 8 is a schematic flowchart illustrating a model training method according to another embodiment of the present disclosure. The model training method provided in the embodiments of the present application will be described in detail below with reference to fig. 8. The model training method may include the steps of:
step S310: the method comprises the steps of obtaining a training sample set, wherein each training sample in the training sample set carries a plurality of target labels, the target labels correspond to behavior prediction tasks one by one, each training sample comprises sample user characteristic data of a sample user, the target labels are used for representing whether the sample user completes target behaviors corresponding to the behavior prediction tasks, and the sample user characteristic data are extracted from a target credit platform.
Step S320: inputting the characteristic data of each sample user into a first initial model to obtain a prediction label corresponding to each behavior prediction task, wherein the prediction label is used for representing the probability value of the target behavior corresponding to the behavior prediction task completed by the sample user.
Step S330: and determining a first loss value corresponding to each behavior prediction task according to the difference degree between the prediction label corresponding to each behavior prediction task and the target label corresponding to each behavior prediction task.
Step S340: and determining a total loss value based on the first loss value corresponding to each behavior prediction task.
Step S350: and performing iterative training on the first initial model according to the total loss value until a first target condition is met, and obtaining the trained first initial model as a target marketing model.
In this embodiment, for the detailed implementation of step S310 to step S350, reference can be made to the foregoing embodiments.
Step S360: and inputting the user characteristic data of each first target user in the first target user group into the target marketing model to obtain a target probability value of each first target user for completing each behavior prediction task.
In this embodiment, the target marketing model may be deployed on a Spark computing platform. On this basis, user data of the first target user group, drawn from the overall pool of active users, can be obtained through a Spark task; the user data is preprocessed, and feature data is then extracted from the preprocessed user data to obtain the user feature data of each first target user in the first target user group. The manner of extracting the user feature data of a first target user is similar in principle to extracting the sample user feature data of a sample user in the model training stage; reference can be made to the foregoing embodiments, and details are not repeated here. The first target user group of overall active users may include users who have been active on the credit APP within a first target time period, or active within a second target time period on the credit APP or with an offline credit company associated with the credit APP.
On this basis, the user feature data of each first target user in the first target user group is input into the target marketing model to obtain a target probability value of each first target user completing each behavior prediction task. That is, for each first target user it is possible to predict the target probability value of clicking the relevant link in the push information after receiving the push information of a credit product, of completing real-name registration in the credit APP, of completing credit application in the credit APP, and of completing a loan in the credit APP.
Step S370: and screening a second target user group meeting the target pushing condition corresponding to the target product from the first target user group according to the target probability value of each first target user for completing each behavior prediction task.
In some embodiments, referring to fig. 9, step S370 may include the contents of steps S371 to S373:
step S371: and determining a behavior prediction task corresponding to the target product from the plurality of behavior prediction tasks as a behavior prediction task to be screened.
In this embodiment, the target product may be a credit product. In practical applications, the behavior prediction task corresponding to a credit product is generally "send success -> borrowing"; that is, for a credit product, the goal is generally to predict the loan intention, i.e., the borrowing probability value, of each first target user in the first target user group. Therefore, "send success -> borrowing" can be used as the behavior prediction task to be screened.
In other embodiments, the behavior prediction task to be screened may also be another behavior prediction task, which may be specifically set according to a product requirement of a target product, and this embodiment does not limit this.
Step S372: and sequencing the first target users in the first target user group according to the sequence of the target probability values of the first target users completing the behavior prediction task to be screened from the largest to the smallest to obtain a first target user sequence.
Further, after the behavior prediction task to be screened is determined, the first target users in the first target user group can be sorted in descending order of their target probability values of completing the behavior prediction task to be screened, to obtain the first target user sequence. The target probability value here is the p4 mentioned in the foregoing embodiments. Obviously, the earlier a first target user appears in the first target user sequence, the higher that user's loan intention.
In some embodiments, the first target user sequence may be acquired every preset time, and the first target user sequence acquired each time is stored in the target database for calling in product push of subsequent credit products.
Step S373: and obtaining the first M first target users in the first target user sequence to obtain the second target user group, wherein M is a positive integer.
The value of M may be determined by the operation policy corresponding to the target product, and the marketing budgets corresponding to different target products are different, so that the operation policies formulated for different target products are also different. It can be understood that the higher the marketing budget of the target product is, the larger the value of M in the operation policy corresponding to the target product is, that is, the credit product can be pushed to more users.
Optionally, since first target users nearer the front of the first target user sequence have higher loan intention, the top M first target users in the first target user sequence can be taken to obtain the second target user group.
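The screening in steps S371 to S373 amounts to a sort-and-truncate; a minimal sketch (the user IDs and probability values below are made up for illustration):

```python
def screen_target_users(user_probs, m):
    """Sort first target users by the target probability value of the behavior
    prediction task to be screened, descending, and keep the top M."""
    ranked = sorted(user_probs.items(), key=lambda kv: kv[1], reverse=True)
    return [user for user, _ in ranked[:m]]

probs = {"u1": 0.35, "u2": 0.82, "u3": 0.10, "u4": 0.67}
second_group = screen_target_users(probs, 2)  # ['u2', 'u4']
```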
Step S380: and pushing the push information corresponding to the target product to the electronic equipment corresponding to the user in the second target user group.
Finally, after the second target user group is determined, the push information corresponding to the target product is pushed, in a preset push mode, to the electronic devices corresponding to the users in the second target user group. The preset push mode may be SMS push, e-mail push, or in-APP pop-up push. Since the users in the second target user group are those with a higher borrowing probability, pushing credit products to this group is more effective, while the waste of marketing budget caused by pushing credit products to users with a low borrowing probability is avoided.
In some embodiments, when the marketing budget is sufficient, the push information corresponding to the target product may be pushed at a first push frequency to the electronic devices of the users in the second target user group, and at a second push frequency to the electronic devices of the other users in the first target user sequence outside the second target user group. The first push frequency is greater than the second push frequency; for example, the first push frequency may be four pushes every two days while the second push frequency is a single push. Pushing the information to the second target user group multiple times prevents users from remaining unaware of the target product simply because they missed a push, and increases the probability that they see the push information; meanwhile, sending the push information only once to the other users with low borrowing intention avoids wasting the marketing budget.
In some embodiments, a second target user group can be screened out from the target database every specified time, and credit products are pushed, namely, the timed marketing and pushing of the products are realized.
In this embodiment, by combining multi-task learning with a noise-robust loss function, the information from the user's pre-loan links (including click, real name, and credit application) can be fully utilized, and the problems of label noise and hard-to-classify samples are mitigated, so that the target marketing model's accuracy in predicting user behavior, and hence the user's borrowing intention, is improved. On the basis of the predicted target probability values, the top M first target users with higher probability values are screened out for pushing the target product. This avoids wasting the product marketing budget while ensuring the product push effect.
Referring to fig. 10, a block diagram of a model training apparatus 400 according to an embodiment of the present application is shown. The apparatus 400 may include: a training sample acquisition module 410, a label prediction module 420, a first loss value determination module 430, a total loss value determination module 440, and a model training module 450.
The training sample obtaining module 410 is configured to obtain a training sample set, where each training sample in the training sample set carries a plurality of target labels, the plurality of target labels correspond to a plurality of behavior prediction tasks one to one, each training sample includes sample user feature data of a sample user, and the target label is used to represent whether the sample user completes a target behavior corresponding to the behavior prediction task.
The label prediction module 420 is configured to input each sample user feature data into a first initial model, to obtain a prediction label corresponding to each behavior prediction task, where the prediction label is used to represent a probability value of the sample user completing a target behavior corresponding to the behavior prediction task.
The first loss value determining module 430 is configured to determine a first loss value corresponding to each behavior prediction task according to a difference degree between a prediction tag corresponding to each behavior prediction task and a target tag corresponding to each behavior prediction task.
The total loss value determining module 440 is configured to determine a total loss value based on the first loss value corresponding to each behavior prediction task.
The model training module 450 is configured to iteratively train the first initial model according to the total loss value until a first target condition is met, and obtain the trained first initial model as a target marketing model.
In some embodiments, the first initial model comprises a plurality of goal task prediction modules, each of the behavior prediction tasks being associated with at least one of the goal task prediction modules, and the tag prediction module may be specifically configured to: respectively inputting each sample user characteristic data into each target task prediction module to obtain a first probability value output by each target task prediction module; and acquiring the product of first probability values output by the target task prediction module associated with each behavior prediction task, and taking the product as a prediction label corresponding to each behavior prediction task.
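Chaining the task prediction modules in this way means each task's prediction label is the product of the first probability values along its funnel, as sketched below (the four-stage funnel and all numeric values are illustrative assumptions matching the click, real name, credit, and loan links described earlier):

```python
from functools import reduce

def prediction_label(stage_probs):
    """Prediction label for one behavior prediction task: the product of the
    first probability values output by its associated task prediction modules."""
    return reduce(lambda acc, p: acc * p, stage_probs, 1.0)

stages = [0.5, 0.8, 0.6, 0.4]      # click, real name, credit, loan (illustrative)
p1 = prediction_label(stages[:1])  # task "send success -> click"
p4 = prediction_label(stages)      # task "send success -> borrowing"
```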
In some embodiments, the first loss value determination module may include: the device comprises a cross entropy loss value acquisition unit, a symmetrical cross entropy loss value acquisition unit and a first loss value acquisition unit. The cross entropy loss value obtaining unit may be configured to obtain, according to a cross entropy loss function, a difference degree between a prediction tag corresponding to each behavior prediction task and a target tag corresponding to each behavior prediction task, as a second loss value corresponding to each behavior prediction task. The symmetric cross entropy loss value obtaining unit may be configured to obtain, according to a symmetric cross entropy loss function, a difference degree between a prediction tag corresponding to each behavior prediction task and a target tag corresponding to each behavior prediction task, as a third loss value corresponding to each behavior prediction task. The first loss value obtaining unit may be configured to perform weighted summation on the second loss value corresponding to each behavior prediction task and the third loss value corresponding to each behavior prediction task to obtain a first loss value corresponding to each behavior prediction task.
In some embodiments, the model training apparatus 400 may further include: the system comprises a target probability acquisition module, a user group screening module and a product pushing module. The target probability obtaining module may be configured to perform iterative training on the first initial model according to the total loss value until a first target condition is met, obtain the trained first initial model, and input user feature data of each first target user in a first target user group to the target marketing model after the trained first initial model is used as the target marketing model, so as to obtain a target probability value that each first target user completes each behavior prediction task. The user group screening module may be configured to screen a second target user group meeting a target pushing condition corresponding to a target product from the first target user group according to a target probability value of each first target user completing each behavior prediction task. The product pushing module may be configured to push pushing information corresponding to the target product to electronic devices corresponding to users in the second target user group.
In this way, the user group screening module may be specifically configured to determine, from the behavior prediction tasks, a behavior prediction task corresponding to the target product as a behavior prediction task to be screened; sequencing the first target users in the first target user group according to the sequence that the target probability value of each first target user for completing the behavior prediction task to be screened is from large to small to obtain a first target user sequence; and obtaining the first M first target users in the first target user sequence to obtain the second target user group, wherein M is a positive integer.
In some embodiments, the training sample acquisition module 410 may be specifically configured to: acquiring a sample data set from a target credit platform, wherein the sample data set comprises user attribute data and historical behavior data of each sample user in a sample user group; extracting the user attribute data of each sample user and the characteristic data of the historical behavior data to obtain sample user characteristic data corresponding to each sample user; determining whether each sample user completes the target behavior corresponding to each behavior prediction task within a target time length according to historical behavior data of each sample user to obtain a plurality of target labels corresponding to each sample user; and adding a plurality of target labels corresponding to each sample user to the sample user characteristic data corresponding to each sample user to obtain the training sample set.
In some embodiments, the total loss value determining module 440 may be specifically configured to perform weighted summation on the first loss values corresponding to each behavior prediction task to obtain the total loss value.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, the coupling between the modules may be electrical, mechanical or other type of coupling.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
To sum up, a training sample set is obtained, each training sample in the training sample set carries a plurality of target labels, the plurality of target labels correspond to a plurality of behavior prediction tasks one by one, each training sample comprises sample user characteristic data of a sample user, the target labels are used for representing whether the sample user completes target behaviors corresponding to the behavior prediction tasks, and the sample user characteristic data is extracted from a target credit platform; inputting the characteristic data of each sample user into the first initial model to obtain a prediction label corresponding to each behavior prediction task, wherein the prediction label is used for representing the probability value of the target behavior corresponding to the behavior prediction task completed by the sample user; determining a first loss value corresponding to each behavior prediction task according to the difference degree between the prediction label corresponding to each behavior prediction task and the target label corresponding to each behavior prediction task; determining a total loss value based on a first loss value corresponding to each behavior prediction task; and performing iterative training on the first initial model according to the total loss value until a first target condition is met, and obtaining the trained first initial model as a target marketing model. 
In this way, a multi-task learning framework is adopted to train the model's prediction capability over a plurality of behavior prediction tasks; a total loss value is determined based on the loss value of each behavior prediction task during prediction, and the model is iteratively trained based on the total loss value. This realizes information sharing and mutual complementation among the behavior prediction tasks, improves the prediction performance of each behavior prediction task, and thus comprehensively improves the model's ability to predict user behavior. As a result, the users screened out based on the target marketing model are of higher quality, and high-quality users on the target credit platform can be accurately selected from a user group for pushing credit products, avoiding the waste of marketing budget and improving the product push effect.
A computer device provided by the present application will be described with reference to fig. 11.
Referring to fig. 11, fig. 11 shows a block diagram of a computer device 500 according to an embodiment of the present application; the method of the embodiments of the present application may be executed by the computer device 500. The computer device may be an electronic terminal with a data processing function, including but not limited to a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart watch, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a smart home device, and the like. Of course, the computer device may also be a server: an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN) services, and big data and artificial intelligence platforms.
The computer device 500 in the embodiments of the present application may include one or more of the following components: a processor 501, a memory 502, and one or more applications, where the one or more applications may be stored in the memory 502 and configured to be executed by the one or more processors 501, the one or more applications being configured to perform the methods described in the foregoing method embodiments.
Processor 501 may include one or more processing cores. Using various interfaces and lines, the processor 501 connects the various parts of the computer device 500, and performs the functions of the computer device 500 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 502 and by calling data stored in the memory 502. Optionally, the processor 501 may be implemented in hardware using at least one of a digital signal processor (DSP), a field-programmable gate array (FPGA), or a programmable logic array (PLA). The processor 501 may integrate one or more of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and so on; the GPU renders and draws display content; and the modem handles wireless communication. It is understood that the modem may alternatively not be integrated into the processor 501 and instead be implemented by a separate communication chip.
The memory 502 may include a random access memory (RAM) or a read-only memory (ROM). The memory 502 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 502 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing the operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the method embodiments described above, and the like. The data storage area may store data created by the computer device 500 in use (such as the various correspondences described above), and so on.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, the coupling, direct coupling, or communication connection between the modules shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or modules may be in electrical, mechanical, or other forms.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing module, each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
Referring to fig. 12, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable medium 600 has stored therein a program code that can be called by a processor to execute the method described in the above-described method embodiments.
The computer-readable storage medium 600 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium 600 comprises a non-transitory computer-readable medium. The computer-readable storage medium 600 has storage space for program code 610 for performing any of the method steps described above. The program code can be read from or written to one or more computer program products. The program code 610 may, for example, be compressed in a suitable form.
In some embodiments, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of the electronic device from the computer-readable storage medium, and the processor executes the computer instructions to cause the electronic device to perform the steps in the above-mentioned method embodiments.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method of model training, the method comprising:
acquiring a training sample set, wherein each training sample in the training sample set carries a plurality of target labels, the plurality of target labels correspond one-to-one with a plurality of behavior prediction tasks, each training sample comprises sample user characteristic data of a sample user, the target labels are used for representing whether the sample user completes target behaviors corresponding to the behavior prediction tasks, and the sample user characteristic data are extracted from a target credit platform;
inputting the characteristic data of each sample user into a first initial model to obtain a prediction label corresponding to each behavior prediction task, wherein the prediction label is used for representing the probability value that the sample user completes the target behavior corresponding to the behavior prediction task;
determining a first loss value corresponding to each behavior prediction task according to the difference degree between a prediction label corresponding to each behavior prediction task and a target label corresponding to each behavior prediction task;
determining a total loss value based on a first loss value corresponding to each behavior prediction task;
and performing iterative training on the first initial model according to the total loss value until a first target condition is met, and obtaining the trained first initial model as a target marketing model.
2. The method of claim 1, wherein the first initial model comprises a plurality of objective task prediction modules, each behavior prediction task is associated with at least one objective task prediction module, and wherein inputting each sample user characteristic data into the first initial model results in a prediction label corresponding to each behavior prediction task, comprises:
respectively inputting each sample user characteristic data into each target task prediction module to obtain a first probability value output by each target task prediction module;
and acquiring the product of first probability values output by the target task prediction module associated with each behavior prediction task, and taking the product as a prediction label corresponding to each behavior prediction task.
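The per-task prediction in claim 2, the product of the first probability values output by the associated target task prediction modules, can be sketched as follows. This is purely illustrative (not the claimed implementation); `task_prediction`, the module outputs, and the association indices are hypothetical:

```python
from math import prod

def task_prediction(module_outputs, associated_indices):
    # Prediction label for one behavior prediction task: the product of the
    # first probability values from each associated target task prediction module.
    return prod(module_outputs[i] for i in associated_indices)

outputs = [0.8, 0.5, 0.9]             # first probability value from each module
p = task_prediction(outputs, [0, 2])  # a task associated with modules 0 and 2
```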
3. The method of claim 1, wherein determining the first loss value for each behavior prediction task according to a degree of difference between the prediction tag corresponding to each behavior prediction task and the target tag corresponding to each behavior prediction task comprises:
according to a cross entropy loss function, obtaining the difference degree between a prediction label corresponding to each behavior prediction task and a target label corresponding to each behavior prediction task, and taking the difference degree as a second loss value corresponding to each behavior prediction task;
according to a symmetrical cross entropy loss function, obtaining the difference degree between a prediction label corresponding to each behavior prediction task and a target label corresponding to each behavior prediction task, and using the difference degree as a third loss value corresponding to each behavior prediction task;
and carrying out weighted summation on the second loss value corresponding to each behavior prediction task and the third loss value corresponding to each behavior prediction task to obtain the first loss value corresponding to each behavior prediction task.
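The first loss value in claim 3 combines a standard cross-entropy term (the second loss) with a symmetric cross-entropy term (the third loss) by weighted summation. A minimal sketch for a single binary prediction follows; the clipping constant and the weights `alpha` and `beta` are illustrative assumptions, not values disclosed in the patent:

```python
import math

def ce(p, q, eps=1e-4):
    # Cross-entropy of target distribution q under prediction p, with both
    # arguments clipped so that log(0) never occurs (needed for hard labels).
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return -(q * math.log(p) + (1 - q) * math.log(1 - p))

def first_loss(pred, target, alpha=1.0, beta=0.5):
    # Second loss: standard cross-entropy CE(pred, target).
    second = ce(pred, target)
    # Third loss: symmetric cross-entropy, CE(pred, target) + CE(target, pred).
    third = ce(pred, target) + ce(target, pred)
    # First loss: weighted summation of the two, per claim 3.
    return alpha * second + beta * third
```

The reverse term penalizes confident labels that disagree with the prediction, which is the usual motivation for symmetric cross-entropy under label noise.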
4. The method according to any one of claims 1-3, wherein after iteratively training the first initial model according to the total loss value until a first target condition is satisfied, obtaining the trained first initial model as a target marketing model, the method further comprises:
inputting user characteristic data of each first target user in a first target user group into the target marketing model to obtain a target probability value of each first target user for completing each behavior prediction task;
screening a second target user group which meets target pushing conditions corresponding to target products from the first target user groups according to the target probability value of each behavior prediction task completed by each first target user;
and pushing the push information corresponding to the target product to the electronic equipment corresponding to the user in the second target user group.
5. The method of claim 4, wherein the screening out, from the first target user groups, a second target user group meeting target pushing conditions corresponding to target products according to the target probability value of each first target user completing each behavior prediction task comprises:
determining a behavior prediction task corresponding to the target product from the behavior prediction tasks as a behavior prediction task to be screened;
sorting the first target users in the first target user group in descending order of each first target user's target probability value of completing the behavior prediction task to be screened, to obtain a first target user sequence;
and obtaining the first M first target users in the first target user sequence to obtain the second target user group, wherein M is a positive integer.
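The screening in claim 5, ranking users by their probability of completing the task to be screened and keeping the first M, can be sketched as below. The function and data names are hypothetical:

```python
def screen_second_group(users, probs, m):
    # Sort the first target users in descending order of the target probability
    # value for the behavior prediction task to be screened, then keep the
    # first M users as the second target user group.
    ranked = sorted(zip(users, probs), key=lambda up: up[1], reverse=True)
    return [user for user, _ in ranked[:m]]

group = ["u1", "u2", "u3", "u4"]
scores = [0.3, 0.9, 0.7, 0.1]       # probability of completing the task
top = screen_second_group(group, scores, 2)
```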
6. The method of any one of claims 1-3, wherein the obtaining a training sample set comprises:
obtaining a sample data set from the target credit platform, wherein the sample data set comprises user attribute data and historical behavior data of each sample user in a sample user group;
extracting the user attribute data of each sample user and the characteristic data of the historical behavior data to obtain sample user characteristic data corresponding to each sample user;
determining whether each sample user completes the target behavior corresponding to each behavior prediction task within a target time length according to historical behavior data of each sample user to obtain a plurality of target labels corresponding to each sample user;
and adding a plurality of target labels corresponding to each sample user to the sample user characteristic data corresponding to each sample user to obtain the training sample set.
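Claim 6's construction of the training sample set, deriving one target label per task from each user's historical behavior data within the target time length, then attaching the labels to that user's characteristic data, might look like the following sketch. All names, the event schema, and the example data are hypothetical:

```python
def build_training_samples(user_features, history, tasks, horizon_days):
    # For each sample user, the target label for a behavior prediction task is
    # 1 if the user's historical behavior data contains the target behavior
    # within the target time length (horizon_days), and 0 otherwise.
    samples = []
    for uid, feats in user_features.items():
        events = history.get(uid, [])
        labels = [int(any(ev["behavior"] == task and ev["day"] <= horizon_days
                          for ev in events))
                  for task in tasks]
        samples.append({"user": uid, "features": feats, "labels": labels})
    return samples

# Hypothetical data: two sample users, two behavior prediction tasks.
features = {"u1": [0.2, 1.0], "u2": [0.7, 0.0]}
history = {"u1": [{"behavior": "register", "day": 3}],
           "u2": [{"behavior": "borrow", "day": 30}]}
samples = build_training_samples(features, history, ["register", "borrow"], 7)
```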
7. The method according to any one of claims 1-3, wherein determining a total loss value based on the first loss value corresponding to each of the behavior prediction tasks comprises:
and carrying out weighted summation on the first loss value corresponding to each behavior prediction task to obtain the total loss value.
8. A model training apparatus, the apparatus comprising:
a training sample acquisition module, configured to acquire a training sample set, where each training sample in the training sample set carries multiple target tags, the multiple target tags are in one-to-one correspondence with multiple behavior prediction tasks, each training sample includes sample user feature data of a sample user, the target tags are used to represent whether the sample user completes a target behavior corresponding to the behavior prediction task, and the sample user feature data is extracted from a target credit platform;
the label prediction module is used for inputting the characteristic data of each sample user into a first initial model to obtain a prediction label corresponding to each behavior prediction task, wherein the prediction label is used for representing the probability value that the sample user completes the target behavior corresponding to the behavior prediction task;
the first loss value determining module is used for determining a first loss value corresponding to each behavior prediction task according to the difference degree between the prediction tag corresponding to each behavior prediction task and the target tag corresponding to each behavior prediction task;
a total loss value determination module, configured to determine a total loss value based on a first loss value corresponding to each behavior prediction task;
and the model training module is used for carrying out iterative training on the first initial model according to the total loss value until a first target condition is met, and obtaining the trained first initial model as a target marketing model.
9. A computer device, comprising:
one or more processors;
a memory;
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that a program code is stored in the computer-readable storage medium, which program code can be called by a processor to perform the method according to any of claims 1-7.
CN202211678130.1A 2022-12-26 2022-12-26 Model training method and device, computer equipment and storage medium Pending CN115965463A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211678130.1A CN115965463A (en) 2022-12-26 2022-12-26 Model training method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211678130.1A CN115965463A (en) 2022-12-26 2022-12-26 Model training method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115965463A true CN115965463A (en) 2023-04-14

Family

ID=87354442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211678130.1A Pending CN115965463A (en) 2022-12-26 2022-12-26 Model training method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115965463A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116468109A (en) * 2023-06-19 2023-07-21 深圳索信达数据技术有限公司 Training method, using method and related device of prediction model
CN116468109B (en) * 2023-06-19 2023-08-29 深圳索信达数据技术有限公司 Training method, using method and related device of prediction model
CN116805253A (en) * 2023-08-18 2023-09-26 腾讯科技(深圳)有限公司 Intervention gain prediction method, device, storage medium and computer equipment
CN116805253B (en) * 2023-08-18 2023-11-24 腾讯科技(深圳)有限公司 Intervention gain prediction method, device, storage medium and computer equipment

Similar Documents

Publication Publication Date Title
CN109544197B (en) User loss prediction method and device
CN115965463A (en) Model training method and device, computer equipment and storage medium
CN111131424B (en) Service quality prediction method based on combination of EMD and multivariate LSTM
CN109903103B (en) Method and device for recommending articles
CN110825969B (en) Data processing method, device, terminal and storage medium
CN111340244B (en) Prediction method, training method, device, server and medium
CN111460384B (en) Policy evaluation method, device and equipment
CN112085615A (en) Method and device for training graph neural network
WO2022148186A1 (en) Behavioral sequence data processing method and apparatus
CN116684330A (en) Traffic prediction method, device, equipment and storage medium based on artificial intelligence
CN115935185A (en) Training method and device for recommendation model
CN114170484A (en) Picture attribute prediction method and device, electronic equipment and storage medium
CN114119123A (en) Information pushing method and device
CN108764489B (en) Model training method and device based on virtual sample
CN116862580A (en) Short message reaching time prediction method and device, computer equipment and storage medium
CN116109420A (en) Insurance product recommendation method, apparatus, equipment and medium
CN113919921A (en) Product recommendation method based on multi-task learning model and related equipment
CN115017362A (en) Data processing method, electronic device and storage medium
CN113434494A (en) Data cleaning method and system, electronic equipment and storage medium
CN110087230B (en) Data processing method, data processing device, storage medium and electronic equipment
CN113191527A (en) Prediction method and device for population prediction based on prediction model
CN113780610A (en) Customer service portrait construction method and device
CN113705682B (en) User behavior feature processing method and device
CN116501993B (en) House source data recommendation method and device
CN114417944B (en) Recognition model training method and device, and user abnormal behavior recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination