WO2024040869A1 - Multi-task model training method, information recommendation method, apparatus, and device - Google Patents

Multi-task model training method, information recommendation method, apparatus, and device Download PDF

Info

Publication number
WO2024040869A1
WO2024040869A1 PCT/CN2023/074122 CN2023074122W WO2024040869A1 WO 2024040869 A1 WO2024040869 A1 WO 2024040869A1 CN 2023074122 W CN2023074122 W CN 2023074122W WO 2024040869 A1 WO2024040869 A1 WO 2024040869A1
Authority
WO
WIPO (PCT)
Prior art keywords
sub
model
loss
value
task
Prior art date
Application number
PCT/CN2023/074122
Other languages
French (fr)
Chinese (zh)
Inventor
王震
张文慧
吴志华
于佃海
Original Assignee
北京百度网讯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司 filed Critical 北京百度网讯科技有限公司
Publication of WO2024040869A1 publication Critical patent/WO2024040869A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning

Definitions

  • the present disclosure relates to the technical field of artificial intelligence, especially to the technical fields of deep learning, cloud computing, multi-task parallel processing, data search and other technical fields. More specifically, the present disclosure provides a multi-task model training method, information recommendation method, device, electronic device and storage medium.
  • deep learning models can be used to handle multiple different tasks simultaneously.
  • the learning speed of each task can be adjusted.
  • the present disclosure provides a multi-task model training method, information recommendation method, device, equipment and storage medium.
  • a training method for a multi-task model includes a plurality of task processing sub-models and a shared sub-model.
  • the method includes: inputting sample behavior data of a sample object into the shared sub-model to obtain The behavioral characteristic information of the sample object, where the behavioral characteristic information includes multiple behavioral characteristic sub-information; according to the behavioral characteristic sub-information related to the task processing sub-model, the sub-loss value of the task processing sub-model is obtained; according to the sub-loss value of the task processing sub-model Loss value, determine the target gradient value corresponding to the sub-loss value; determine the weight value corresponding to the sub-loss value based on multiple target gradient values; and determine multiple sub-loss values based on the multiple sub-loss values and multiple weight values corresponding to the multiple sub-loss values. , train a multi-task model.
  • an information recommendation method includes: inputting the target behavior data of the target object into a multi-task model to obtain multiple output results; and recommending target information to the target object based on the multiple output results. , wherein the multi-task model is trained according to the method provided by this disclosure.
  • a training device for a multi-task model includes a plurality of task processing sub-models and a shared sub-model.
  • the device includes: a first acquisition module for obtaining samples of sample objects.
  • the behavioral data is input into the shared sub-model to obtain the behavioral characteristic information of the sample object, where the behavioral characteristic information includes multiple behavioral characteristic sub-information; the second acquisition module is used to obtain the task based on the behavioral characteristic sub-information related to the task processing sub-model.
  • the first determination module is used to process the sub-loss values of the sub-model according to the task and determine the target gradient value corresponding to the sub-loss value
  • the second determination module is used to determine the target gradient value according to multiple target gradient values. Determine weight values corresponding to the sub-loss values
  • a training module for training a multi-task model based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
  • an information recommendation device includes: a third acquisition module for inputting the target behavior data of the target object into a multi-task model to obtain multiple output results; a recommendation module for According to multiple output results, target information is recommended to the target object, wherein the multi-task model is trained according to the device provided by the present disclosure.
  • an electronic device including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions that can be executed by the at least one processor, and the instructions are At least one processor executes to enable at least one processor to execute the method provided in accordance with the present disclosure.
  • a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute a method provided according to the present disclosure.
  • a computer program product including a computer program that, when executed by a processor, implements a method provided according to the present disclosure.
  • Figure 1 is a flow chart of a training method for a multi-task model according to an embodiment of the present disclosure
  • Figure 2 is a schematic diagram of obtaining a sub-loss value of a task processing sub-model according to an embodiment of the present disclosure
  • Figure 3 is a schematic diagram of determining a target gradient value corresponding to a sub-loss value according to an embodiment of the present disclosure
  • Figure 4 is a flow chart of a training method for a multi-task model according to one embodiment of the present disclosure
  • Figure 5 is a flow chart of an information recommendation method according to another embodiment of the present disclosure.
  • Figure 6 is a block diagram of a training device for a multi-task model according to an embodiment of the present disclosure
  • Figure 7 is a block diagram of an information recommendation device according to another embodiment of the present disclosure.
  • FIG. 8 is a block diagram of an electronic device that can apply a multi-task model training method and/or an information recommendation method according to one embodiment of the present disclosure.
  • the collection, storage, use, processing, transmission, provision, disclosure and application of user personal information are in compliance with relevant laws and regulations, necessary confidentiality measures have been taken, and do not violate public order and good customs .
  • the user's authorization or consent is obtained before obtaining or collecting the user's personal information.
  • Deep learning frameworks can provide developers with relevant model interfaces.
  • the deep learning framework can also provide auxiliary tools for analyzing scenario problems from the developer's perspective to improve development efficiency and development experience.
  • Deep learning frameworks may include, for example, PaddlePaddle, Tensorflow, and so on.
  • the Flying Paddle framework can support deep learning models and machine learning models.
  • the learning forms of the model can include supervised learning and unsupervised learning, etc.
  • Supervised learning can include, for example, multi-task learning.
  • the task of optimizing more than one objective function can be called multi-task learning.
  • the counterpart to multi-task learning is single-task learning.
  • single-task learning the optimization of an objective function can be studied. In order to achieve multi-task learning, it is necessary to process data from multiple tasks and balance the learning speed between multiple tasks.
  • Multi-task learning has a wide range of applications and can be applied to natural language processing, computer vision and information recommendation and other technical fields.
  • the behavioral data of an object may involve multiple tasks. For example, for short videos, determining the video completion rate, determining the click rate, determining the attention rate, determining the like rate, etc. can be regarded as one task respectively.
  • multi-task learning can be implemented by a multi-task model.
  • a certain balance can be maintained between different tasks so that the parameters learned by the multi-task model have certain applicability to each task.
  • the weight of the tasks can be set.
  • the gradient changes of different sub-models used to perform multiple tasks are difficult to represent with a constant. Even if the gradient changes of some sub-models can be represented by constants, multiple sets of experiments need to be set up for one sub-model for debugging. Therefore, the time cost of training multi-task models is relatively high.
  • Figure 1 is a flow chart of a training method for a multi-task model according to an embodiment of the present disclosure.
  • the method 100 may include operations S110 to S150.
  • the multi-tasking model includes multiple task processing sub-models and sharing sub-models.
  • each task processing submodel is used to process one task.
  • the shared submodel can be used to extract features of sample data.
  • the sample behavior data of the sample object is input into the shared sub-model to obtain the behavior characteristic information of the sample object.
  • the sample behavior data may be the behavior data of the sample object within a sample period.
  • the historical behavior performed by the sample object on one or more historical information can be collected.
  • the historical information may be, for example, short videos that the sample subject has watched.
  • These historical behaviors may include, for example, clicks, followings, likes, comments, and other behaviors.
  • the behavioral characteristic information includes a plurality of behavioral characteristic sub-information.
  • a behavioral characteristic sub-information can correspond to a behavior.
  • a sub-loss value of the task processing sub-model is obtained based on the behavioral characteristic sub-information related to the task processing sub-model.
  • the task processing sub-model can be used to process the behavioral characteristic sub-information to obtain the output result. Based on the output results, determine the sub-loss value of the task processing sub-model.
  • the output result can indicate the probability that the sample object performs an action on the sample information.
  • the sample information may be a short video that the sample subject has not watched.
  • various loss functions can be used to determine the sub-loss value of the task processing sub-model based on the output results.
  • a target gradient value corresponding to the sub-loss value is determined according to the sub-loss value of the task processing sub-model.
  • the target gradient value may be a gradient value of the shared sub-model.
  • the task processing sub-model may include at least one task processing layer.
  • the shared submodel may include at least one shared layer
  • At least one gradient value of the task processing sub-model can be determined, or at least one gradient value of the shared sub-model can be determined.
  • the target gradient value corresponding to the sub-loss value may be determined based on at least one gradient value of the shared sub-model.
  • any one of the at least one gradient value of the shared sub-model can be used as the target gradient value.
  • a weight value corresponding to the sub-loss value is determined according to the plurality of target gradient values.
  • the weight value corresponding to the sub-loss value may be determined.
  • various operations can be performed based on multiple target gradient values to obtain the operation results of the gradient values.
  • Various operations may include, for example, summation, multiplication, and the like.
  • the weight value corresponding to the sub-loss value can be obtained.
  • a multi-task model is trained according to a plurality of sub-loss values and a plurality of weight values respectively corresponding to the plurality of sub-loss values.
  • a loss value can be obtained based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values. Based on the loss value, a multi-task model can be trained.
  • various operations can be performed based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values to obtain the loss value.
  • various operations may include weighted sums, weighted averages, and the like.
  • a multi-task model can be trained based on the loss value.
  • the final loss value is determined based on multiple target gradient values. Pay attention to It eliminates the difference in learning speed between different tasks and can effectively balance the learning speed of different tasks, making the parameters learned by the multi-task model have certain applicability to different tasks.
  • the multi-task model obtained thus has stronger multi-task parallel processing capabilities, improves data processing efficiency when hardware resources are preset or limited, and can recommend more accurate information based on the object's behavioral data.
  • Figure 2 is a schematic diagram of obtaining sub-loss values of a task processing sub-model according to one embodiment of the present disclosure.
  • the multi-task model may include a shared sub-model 210 and n task processing sub-models.
  • n is an integer greater than 1.
  • the n task processing sub-models may include, for example, the first task processing sub-model 221, the second task processing sub-model 222, ..., the n-th task processing sub-model 223.
  • n can be 3.
  • the sample behavior data of the sample object can be input into the shared sub-model to obtain the behavioral characteristic information of the sample object.
  • Behavioral characteristic information may include n behavioral characteristic sub-information.
  • shared submodel 210 may include multiple shared layers.
  • the multiple shared layers can be the first shared layer, the second shared layer,..., the m-th shared layer, where m is an integer greater than 1.
  • the first shared layer can process sample behavioral data and obtain the first initial behavioral characteristic information.
  • the second shared layer can process the first initial behavioral characteristic information and obtain the second initial behavioral characteristic information.
  • the m-th shared layer can process the m-1 initial behavioral feature information and obtain the m-th initial behavioral feature information.
  • the m-th initial behavioral characteristic information can be used as the behavioral characteristic information of the sample object.
  • the n behavioral characteristic sub-information may include the first behavioral characteristic sub-information, the second behavioral characteristic sub-information, ..., the n-th behavioral characteristic sub-information.
  • the sub-loss value of the task processing sub-model can be obtained.
  • the behavioral characteristic sub-information related to the task processing sub-model can be input into the task processing sub-model to obtain the output result of the task processing sub-model. According to the output results, The sub-loss value of the task processing sub-model can be obtained.
  • the first behavioral characteristic sub-information can be input into the first task processing sub-model 221 to obtain the first output result 2211.
  • the first output result 2211 may indicate the predicted probability of the sample object performing the first behavior.
  • various loss functions can be used to determine the first sub-loss value 2212.
  • the first sub-label can indicate the true probability of the sample object performing the first behavior.
  • Various loss functions may include, for example, Cross Entropy (CE) loss function, L1 loss function, etc.
  • the second behavioral characteristic sub-information can be input into the second task processing sub-model 222 to obtain the second output result 2221.
  • the second output result 2221 may indicate the predicted probability of the sample object performing the second behavior.
  • various loss functions can be used to determine the second sub-loss value 2222.
  • the second sub-label can indicate the true probability of the sample object performing the second behavior.
  • the n-th behavioral characteristic sub-information can be input into the n-th task processing sub-model 223 to obtain the n-th output result 2231.
  • the nth output result 2231 may indicate the predicted probability of the sample object performing the nth behavior.
  • various loss functions are used to determine the n-th sub-loss value 2232 based on the n-th output result 2231 and the n-th sub-label of the sample behavior data.
  • the nth sub-label can indicate the true probability of the sample object performing the nth behavior.
  • manual annotation can be performed on a piece of sample information and sample behavior data 201 to obtain a label for the sample information.
  • the tag can include n sub-tags.
  • the n sub-tags may include, for example, the above-mentioned first sub-tag, second sub-tag, ..., n-th sub-tag.
  • FIG. 3 is a schematic diagram of determining a target gradient value corresponding to a sub-loss value according to an embodiment of the present disclosure.
  • the multi-task model may include a shared sub-model 310 and n task processing sub-models.
  • the n task processing sub-models may include, for example, the first task processing sub-model 321, the second task processing sub-model 322, ..., and the n-th task processing sub-model 323.
  • n can be 3.
  • Shared sub-model 310 may also include multiple shared layers.
  • the multiple shared layers can be the first shared layer, the second shared layer,..., the m-th shared layer, where m is an integer greater than 1.
  • the first shared layer can process sample behavioral data and obtain the first initial behavioral characteristic information.
  • the second shared layer can process the first initial behavioral characteristic information and obtain the second initial behavioral characteristic information.
  • the m-th shared layer can process the m-1 initial behavioral feature information and obtain the m-th initial behavioral feature information.
  • the m-th initial behavioral characteristic information can be used as the behavioral characteristic information of the sample object.
  • the target gradient value corresponding to the loss value may be determined. It can be understood that the above detailed description of the first sub-loss value 2212, the second sub-loss value 2222, ..., the n-th sub-loss value 2232 can also be applied to this embodiment, and the present disclosure will not be repeated here.
  • the gradient value of the first task processing sub-model 321 can be determined.
  • the m first gradient values of the shared sub-model 310 can be determined.
  • the first gradient value of the first shared layer can be used as the first target gradient value 3213. It can be understood that in the process of determining the first gradient value of the first shared layer, the relevant parameters of the first shared layer can be used.
  • the gradient value of the second task processing sub-model 322 can be determined based on the second sub-loss value 3222 and the parameters of the second task processing sub-model 322.
  • m second gradient values of the shared sub-model 310 can be determined.
  • the second gradient value of the first shared layer can be used as the second target gradient value 3223. It can be understood that in the process of determining the second gradient value of the first shared layer, the relevant parameters of the first shared layer can be used.
  • the gradient value of the n-th task processing sub-model 323 can be determined according to the n-th sub-loss value 3232 and the parameters of the n-th task processing sub-model 323.
  • the m nth gradient values of the shared sub-model 310 can be determined.
  • the nth gradient value of the first shared layer can be used as the nth target gradient value 3233. It can be understood that when determining the nth gradient value of the first shared layer In the process, you can use the relevant parameters of the first shared layer.
  • the first shared layer can process sample behavioral data.
  • the first shared layer can be the last shared layer of multiple shared layers.
  • any first gradient value among the m first gradient values may also be used as the first target gradient value. It is also possible to use any second gradient value among the m second gradient values as the second target gradient value. ...You can also use any n-th gradient value among the m n-th gradient values as the n-th target gradient value.
  • Figure 4 is a flowchart of a training method for a multi-task model according to one embodiment of the present disclosure.
  • the sample behavior data of the sample object can be input into the shared sub-model to obtain the behavioral characteristic information of the sample object.
  • behavioral characteristic information includes multiple behavioral characteristic sub-information.
  • the sub-loss value of the task processing sub-model is obtained according to the behavioral characteristic sub-information related to the task processing sub-model.
  • the first sub-loss value loss_1 can be obtained.
  • the first sub-loss value loss_1 can be the sub-loss value of the first task processing sub-model.
  • the second sub-loss value loss_2 can be obtained.
  • the second sub-loss value loss_2 can be the sub-loss value of the second task processing sub-model.
  • the nth sub-loss value loss_n can be obtained.
  • the n-th sub-loss value loss_n may be the sub-loss value of the n-th task processing sub-model.
  • operations S430, S440, and S451 may be performed.
  • a target gradient value corresponding to the sub-loss value is determined.
  • the first target gradient value grad_1 corresponding to the first sub-loss value loss_1 can be determined based on the first sub-loss value loss_1.
  • the second target gradient value grad_2 corresponding to the second sub-loss value loss_2 can be determined based on the second sub-loss value loss_2.
  • the n-th target gradient value grad_n corresponding to the n-th sub-loss value loss_n can be determined based on the n-th sub-loss value loss_n.
  • the relevant parameters para_last_shared_layer of the first shared layer of the shared sub-model can be used in the process of determining the first target gradient value grad_1, the second target gradient value grad_2, ..., and the nth target gradient value grad_n.
  • the weight value corresponding to the sub-loss value can be determined according to multiple target gradient values.
  • the processing parameter values can be determined based on multiple target gradient values.
  • the processing parameter value can be determined based on the first target gradient value grad_1, the second target gradient value grad_2, ..., and the nth target gradient value grad_n.
  • the weight value corresponding to the sub-loss value may be determined.
  • the processed target gradient value can be obtained.
  • the processed target gradient value can be normalized to obtain the normalized gradient value.
  • the weight value corresponding to the sub-loss value can be determined.
  • the weight value of the task can be smaller through the embodiments of the present disclosure. In the final loss value, the proportion of the task decreases, which can effectively balance the learning speed between different tasks.
  • the i-th weight value w_i corresponding to the i-th sub-loss value loss_i can be determined by the following formula.
  • i can be an integer greater than or equal to 1, and i can be an integer less than or equal to n.
  • ⁇ i exp(grad_i) can be a processing parameter value.
  • exp(grad_i) can be the i-th processed target gradient value.
  • the first weight value w_1, the second weight value w_2,..., and the nth weight value W_n can be determined.
  • the first target gradient value grad_1 is processed to obtain the first A processed target gradient value exp(grad_1).
  • the processing parameter value ⁇ i exp(grad_i) normalize the first processed target gradient value exp(grad_1) to get the first normalized gradient value exp(grad_1)/ ⁇ i exp(grad_i) .
  • the reciprocal of the first normalized gradient value can be used as the first weight value w_1.
  • the loss value can be obtained based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
  • the loss value can be obtained by performing a weighted sum based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
  • the first sub-loss value loss_1, the second sub-loss value loss_2,..., the n-th sub-loss value loss_n and the first weight value w_1, the second weight value w_2,..., the n-th weight value can be used w_n, perform weighted summation to obtain the loss value Loss.
  • a multi-task model can be trained according to the loss value.
  • the parameters of multiple task processing sub-models and shared sub-models can be adjusted separately according to the loss value to train a multi-task model.
  • the loss value can be used to adjust parameters of multiple task processing sub-models and shared sub-models.
  • Figure 5 is a flow chart of an information recommendation method according to another embodiment of the present disclosure.
  • the method 500 may include operations S510 to S520.
  • the target behavior data of the target object is input into the multi-task model to obtain multiple output results.
  • the multi-task model may be trained according to the method provided by the present disclosure.
  • a multi-task model can be trained according to method 100.
  • the output result may indicate the probability that the target object performs an action on a piece of candidate information.
  • the candidate information can be short video, image or text information.
  • an output result may indicate the probability that the target object performs a "click" action on the candidate information.
  • another output result may indicate the probability that the target object performs a "comment” action on the candidate information.
  • target information is recommended to the target object according to the plurality of output results.
  • recommended parameters of candidate information are determined based on multiple output results.
  • the target information is determined from the plurality of candidate information according to the recommended parameters of the plurality of candidate information. Recommend targeted information to target audiences.
  • the output can be normalized to a value greater than 0 and less than 1.
  • various operations can be performed to obtain recommended parameters for a candidate information.
  • various operations may include, for example, averaging, summing, weighted sums, and the like.
  • the candidate information with the largest recommendation parameter can be used as the target information and recommended to the target object.
  • information can be accurately recommended to target objects, information push efficiency can be improved, and user experience can be improved.
  • candidate information may be determined from a plurality of initial information based on target behavior data of the target object.
  • target behavior data can be converted into a target vector. Calculate the similarity between the target vector and the feature vector of the initial information. When the similarity is greater than the preset similarity threshold, the initial information corresponding to the similarity can be used as a candidate information.
  • multiple candidate information may be determined.
  • the application method of the multi-task model provided by the present disclosure in the field of information recommendation is described in detail above, but the present disclosure is not limited thereto.
  • the multi-task model provided by this disclosure can also be applied to other fields (such as image processing, text processing, audio processing, etc.).
  • Figure 6 is a block diagram of a training device for a multi-task model according to an embodiment of the present disclosure.
  • the multi-tasking model may include multiple task processing sub-models and sharing sub-models.
  • the device 600 may include a first obtaining module 610 , a second obtaining module 620 , a first determining module 630 , a second determining module 640 and a training module 650 .
  • the first obtaining module 610 is used to input the sample behavior data of the sample object into the shared sub-model, Obtain the behavioral characteristic information of the sample object.
  • behavioral characteristic information includes multiple behavioral characteristic sub-information.
  • the second obtaining module 620 is used to obtain the sub-loss value of the task processing sub-model according to the behavioral characteristic sub-information related to the task processing sub-model.
  • the first determination module 630 is configured to determine the target gradient value corresponding to the sub-loss value according to the sub-loss value of the task processing sub-model.
  • the second determination module 640 is used to determine the weight value corresponding to the sub-loss value according to multiple target gradient values.
  • the training module 650 is used to train a multi-task model based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
  • the second determination module includes: a first determination sub-module, used to determine the processing parameter value according to the plurality of target gradient values; and a second determination sub-module, used to determine the processing parameter value and the sub-loss value according to the processing parameter value and the sub-loss value.
  • the corresponding target gradient value determines the weight value corresponding to the sub-loss value.
  • the second determination sub-module includes: a processing unit, used to process the target gradient value corresponding to the sub-loss value, to obtain the processed target gradient value; a normalization processing unit, used to process the target gradient value according to the processing parameter value , normalize the processed target gradient value to obtain a normalized gradient value; and a determination unit used to determine the weight value corresponding to the sub-loss value based on the reciprocal of the normalized gradient value.
  • the second obtaining module includes: a first obtaining sub-module, used to input the behavioral characteristic sub-information related to the task processing sub-model into the task processing sub-model, and obtain the output result of the task processing sub-model; and a second obtaining sub-module. Obtain sub-module, which is used to obtain the sub-loss value of the task processing sub-model based on the output results.
  • the training module includes: a third obtaining sub-module, used to obtain a loss value according to multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values; and a training sub-module, used to obtain a loss value according to the loss value to train a multi-task model.
  • the third obtaining sub-module includes: a weighted summation unit, configured to perform weighted summation based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values, to obtain a loss value.
  • the training sub-module includes: an adjustment unit, configured to respectively adjust parameters of multiple task processing sub-models and shared sub-models according to the loss value to train the multi-task model.
  • FIG. 7 is a block diagram of an information recommendation device according to another embodiment of the present disclosure.
  • the device 700 may include a third obtaining module 710 and a recommendation module 720 .
  • the third obtaining module 710 is used to input the target behavior data of the target object into the multi-task model to obtain multiple output results.
  • the recommendation module 720 is used to recommend target information to target objects based on multiple output results.
  • the multi-task model is trained according to the device provided by the present disclosure.
  • the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
  • an electronic device includes: at least one processor; and a memory communicatively connected with the at least one processor.
  • the memory stores instructions executable by at least one processor, and the instructions are executed by at least one processor, so that the at least one processor can perform the method provided according to the present disclosure.
  • the readable storage medium stores computer instructions
  • the readable storage medium may be a non-transitory computer-readable storage medium.
  • computer instructions may cause a computer to perform methods provided in accordance with the present disclosure.
  • a computer program product includes a computer program that, when executed by a processor, implements a method provided in accordance with the present disclosure. Detailed description will be given below with reference to Figure 8 .
  • FIG. 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure.
  • Electronic devices are intended to refer to various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • the device 800 includes a computing unit 801 that can execute according to a computer program stored in a read-only memory (ROM) 802 or loaded from a storage unit 808 into a random access memory (RAM) 803 Various appropriate actions and treatments. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored.
  • Computing unit 801, ROM 802 and RAM 803 are connected to each other via bus 804.
  • Input/output (I/O) Interface 805 is also connected to bus 804.
  • the I/O interface 805 includes: an input unit 806, such as a keyboard, a mouse, etc.; an output unit 807, such as various types of displays, speakers, etc.; a storage unit 808, such as a magnetic disk, optical disk, etc. ; and communication unit 809, such as a network card, modem, wireless communication transceiver, etc.
  • the communication unit 809 allows the device 800 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.
  • Computing unit 801 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, digital signal processing processor (DSP), and any appropriate processor, controller, microcontroller, etc.
  • the computing unit 801 performs various methods and processes described above, such as a multi-task model training method and/or an information recommendation method.
  • the multi-task model training method and/or the information recommendation method may be implemented as a computer software program, which is tangibly included in a machine-readable medium, such as the storage unit 808.
  • part or all of the computer program may be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809.
  • the computer program When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the training method of the multi-task model and/or the information recommendation method described above may be performed.
  • the computing unit 801 may be configured to perform the multi-task model training method and/or the information recommendation method in any other suitable manner (eg, by means of firmware).
  • Various implementations of the systems and techniques described above may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip implemented in a system (SOC), complex programmable logic device (CPLD), computer hardware, firmware, software, and/or combinations thereof.
  • FPGAs field programmable gate arrays
  • ASICs application specific integrated circuits
  • ASSPs application specific standard products
  • SOC system
  • CPLD complex programmable logic device
  • computer hardware firmware, software, and/or combinations thereof.
  • These various embodiments may include implementation in one or more computer programs executable and/or interpreted on a programmable system including at least one programmable processor, the programmable processor
  • the processor which may be a special purpose or general purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • An output device may be a special purpose or general purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • An output device may be a special purpose or general purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing device, such that the program codes, when executed by the processor or controller, cause the functions specified in the flowcharts and/or block diagrams/ The operation is implemented.
  • the program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, laptop disks, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
  • RAM random access memory
  • ROM read only memory
  • EPROM or flash memory erasable programmable read only memory
  • CD-ROM portable compact disk read-only memory
  • magnetic storage device or any suitable combination of the above.
  • the systems and techniques described herein may be implemented on a computer having a display device (eg, a CRT (Cathode Ray Tube) display or an LCD (Liquid Crystal Display)) for displaying information to the user. ; and a keyboard and pointing device (eg, a mouse or a trackball) through which a user can provide input to the computer.
  • a display device eg, a CRT (Cathode Ray Tube) display or an LCD (Liquid Crystal Display)
  • a keyboard and pointing device eg, a mouse or a trackball
  • Other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and may be provided in any form, including Acoustic input, voice input or tactile input) to receive input from the user.
  • the systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., A user's computer having a graphical user interface or web browser through which the user can interact with implementations of the systems and technologies described herein), or including such backend components, middleware components, or any combination of front-end components in a computing system.
  • Digital data communications e.g., communications networks
  • Examples of communication networks include: local area network (LAN), wide area network (WAN), and the Internet.
  • Computer systems may include clients and servers.
  • Clients and servers are generally remote from each other and typically interact over a communications network.
  • the relationship of client and server is created by computer programs running on corresponding computers and having a client-server relationship with each other.
  • the server can be a cloud server, a distributed system server, or a server combined with a blockchain.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a multi-task model training method, relating to the technical field of artificial intelligence, and in particular, to the technical fields of deep learning, cloud computing, multi-task parallel processing, and data searching. A specific implementation solution comprises: inputting sample behavior data of a sample object into a shared sub-model, to obtain behavior characteristic information of the sample object, the behavior characteristic information comprising multiple pieces of behavior characteristic sub-information; obtaining a sub-loss value of a task processing sub-model according to the behavior characteristic sub-information associated with the task processing sub-model; determining a target gradient value corresponding to the sub-loss value, according to the sub-loss value of the task processing sub-model; determining a weight value corresponding to the sub-loss value, according to multiple target gradient values; and training a multi-task model according to multiple sub-loss values and the multiple weight values corresponding to the multiple sub-loss values. The present disclosure further provides an information recommendation method, an apparatus, an electronic device, and a storage medium.

Description

多任务模型的训练方法、信息推荐方法、装置和设备Training methods, information recommendation methods, devices and equipment for multi-task models
本申请要求于2022年8月24日递交的中国专利申请No.202211015938.1的优先权,其内容一并在此作为参考。This application claims priority from Chinese Patent Application No. 202211015938.1 submitted on August 24, 2022, the content of which is hereby incorporated by reference.
技术领域Technical field
本公开涉及人工智能技术领域,尤其涉及深度学习、云计算、多任务并行处理和数据搜索等技术领域。更具体地,本公开提供了一种多任务模型的训练方法、信息推荐方法、装置、电子设备和存储介质。The present disclosure relates to the technical field of artificial intelligence, especially to the technical fields of deep learning, cloud computing, multi-task parallel processing, data search and other technical fields. More specifically, the present disclosure provides a multi-task model training method, information recommendation method, device, electronic device and storage medium.
背景技术Background technique
随着人工智能技术的发展,深度学习模型可以用于同时处理多个不同的任务。在训练过程中,为了使得深度学习模型学习到的参数对每个任务都有一定的适用性,可以调整各任务的学习速度。With the development of artificial intelligence technology, deep learning models can be used to handle multiple different tasks simultaneously. During the training process, in order to make the parameters learned by the deep learning model have certain applicability to each task, the learning speed of each task can be adjusted.
发明内容Contents of the invention
本公开提供了一种多任务模型的训练方法、信息推荐方法、装置、设备以及存储介质。The present disclosure provides a multi-task model training method, information recommendation method, device, equipment and storage medium.
根据本公开的一方面,提供了一种多任务模型的训练方法,多任务模型包括多个任务处理子模型和共享子模型,该方法包括:将样本对象的样本行为数据输入共享子模型,得到样本对象的行为特征信息,其中,行为特征信息包括多个行为特征子信息;根据与任务处理子模型相关的行为特征子信息,得到任务处理子模型的子损失值;根据任务处理子模型的子损失值,确定与子损失值对应的目标梯度值;根据多个目标梯度值,确定与子损失值对应的权重值;以及根据多个子损失值以及分别与多个子损失值对应的多个权重值,训练多任务模型。According to one aspect of the present disclosure, a training method for a multi-task model is provided. The multi-task model includes a plurality of task processing sub-models and a shared sub-model. The method includes: inputting sample behavior data of a sample object into the shared sub-model to obtain The behavioral characteristic information of the sample object, where the behavioral characteristic information includes multiple behavioral characteristic sub-information; according to the behavioral characteristic sub-information related to the task processing sub-model, the sub-loss value of the task processing sub-model is obtained; according to the sub-loss value of the task processing sub-model Loss value, determine the target gradient value corresponding to the sub-loss value; determine the weight value corresponding to the sub-loss value based on multiple target gradient values; and determine multiple sub-loss values based on the multiple sub-loss values and multiple weight values corresponding to the multiple sub-loss values. , train a multi-task model.
根据本公开的另一方面,提供了一种信息推荐方法,该方法包括:将目标对象的目标行为数据输入多任务模型,得到多个输出结果;根据多个输出结果,向目标对象推荐目标信息,其中,多任务模型是根据本公开提供的方法训练的。 According to another aspect of the present disclosure, an information recommendation method is provided. The method includes: inputting the target behavior data of the target object into a multi-task model to obtain multiple output results; and recommending target information to the target object based on the multiple output results. , wherein the multi-task model is trained according to the method provided by this disclosure.
根据本公开的另一方面,提供了一种多任务模型的训练装置,多任务模型包括多个任务处理子模型和共享子模型,该装置包括:第一获得模块,用于将样本对象的样本行为数据输入共享子模型,得到样本对象的行为特征信息,其中,行为特征信息包括多个行为特征子信息;第二获得模块,用于根据与任务处理子模型相关的行为特征子信息,得到任务处理子模型的子损失值;第一确定模块,用于根据任务处理子模型的子损失值,确定与子损失值对应的目标梯度值;第二确定模块,用于根据多个目标梯度值,确定与子损失值对应的权重值;以及训练模块,用于根据多个子损失值以及分别与多个子损失值对应的多个权重值,训练多任务模型。According to another aspect of the present disclosure, a training device for a multi-task model is provided. The multi-task model includes a plurality of task processing sub-models and a shared sub-model. The device includes: a first acquisition module for obtaining samples of sample objects. The behavioral data is input into the shared sub-model to obtain the behavioral characteristic information of the sample object, where the behavioral characteristic information includes multiple behavioral characteristic sub-information; the second acquisition module is used to obtain the task based on the behavioral characteristic sub-information related to the task processing sub-model. Process the sub-loss values of the sub-model; the first determination module is used to process the sub-loss values of the sub-model according to the task and determine the target gradient value corresponding to the sub-loss value; the second determination module is used to determine the target gradient value according to multiple target gradient values. Determine weight values corresponding to the sub-loss values; and a training module for training a multi-task model based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
根据本公开的另一方面,提供了一种信息推荐装置,该装置包括:第三获得模块,用于将目标对象的目标行为数据输入多任务模型,得到多个输出结果;推荐模块,用于根据多个输出结果,向目标对象推荐目标信息,其中,多任务模型是根据本公开提供的装置训练的。According to another aspect of the present disclosure, an information recommendation device is provided. The device includes: a third acquisition module for inputting the target behavior data of the target object into a multi-task model to obtain multiple output results; a recommendation module for According to multiple output results, target information is recommended to the target object, wherein the multi-task model is trained according to the device provided by the present disclosure.
根据本公开的另一方面,提供了一种电子设备,包括:至少一个处理器;以及与至少一个处理器通信连接的存储器;其中,存储器存储有可被至少一个处理器执行的指令,指令被至少一个处理器执行,以使至少一个处理器能够执行根据本公开提供的方法。According to another aspect of the present disclosure, an electronic device is provided, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions that can be executed by the at least one processor, and the instructions are At least one processor executes to enable at least one processor to execute the method provided in accordance with the present disclosure.
根据本公开的另一方面,提供了一种存储有计算机指令的非瞬时计算机可读存储介质,该计算机指令用于使计算机执行根据本公开提供的方法。According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute a method provided according to the present disclosure.
根据本公开的另一方面,提供了一种计算机程序产品,包括计算机程序,该计算机程序在被处理器执行时实现根据本公开提供的方法。According to another aspect of the present disclosure, a computer program product is provided, including a computer program that, when executed by a processor, implements a method provided according to the present disclosure.
应当理解,本部分所描述的内容并非旨在标识本公开的实施例的关键或重要特征,也不用于限制本公开的范围。本公开的其它特征将通过以下的说明书而变得容易理解。It should be understood that what is described in this section is not intended to identify key or important features of the embodiments of the disclosure, nor is it intended to limit the scope of the disclosure. Other features of the present disclosure will become readily understood from the following description.
附图说明Description of drawings
附图用于更好地理解本方案,不构成对本公开的限定。其中:The accompanying drawings are used to better understand the present solution and do not constitute a limitation of the present disclosure. in:
图1是根据本公开的一个实施例的多任务模型的训练方法的流程图;Figure 1 is a flow chart of a training method for a multi-task model according to an embodiment of the present disclosure;
图2是根据本公开的一个实施例的获得任务处理子模型的子损失值的示意图; Figure 2 is a schematic diagram of obtaining a sub-loss value of a task processing sub-model according to an embodiment of the present disclosure;
图3是根据本公开的一个实施例的确定与子损失值对应的目标梯度值的示意图;Figure 3 is a schematic diagram of determining a target gradient value corresponding to a sub-loss value according to an embodiment of the present disclosure;
图4是根据本公开的一个实施例的多任务模型的训练方法的流程图;Figure 4 is a flow chart of a training method for a multi-task model according to one embodiment of the present disclosure;
图5是根据本公开的另一个实施例的信息推荐方法的流程图;Figure 5 is a flow chart of an information recommendation method according to another embodiment of the present disclosure;
图6是根据本公开的一个实施例的多任务模型的训练装置的框图;Figure 6 is a block diagram of a training device for a multi-task model according to an embodiment of the present disclosure;
图7是根据本公开的另一个实施例的信息推荐装置的框图;以及Figure 7 is a block diagram of an information recommendation device according to another embodiment of the present disclosure; and
图8是根据本公开的一个实施例的可以应用多任务模型的训练方法和/或信息推荐方法的电子设备的框图。FIG. 8 is a block diagram of an electronic device that can apply a multi-task model training method and/or an information recommendation method according to one embodiment of the present disclosure.
具体实施方式Detailed ways
以下结合附图对本公开的示范性实施例做出说明,其中包括本公开实施例的各种细节以助于理解,应当将它们认为仅仅是示范性的。因此,本领域普通技术人员应当认识到,可以对这里描述的实施例做出各种改变和修改,而不会背离本公开的范围和精神。同样,为了清楚和简明,以下的描述中省略了对公知功能和结构的描述。Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding and should be considered to be exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the disclosure. Also, descriptions of well-known functions and constructions are omitted from the following description for clarity and conciseness.
本公开的技术方案中,所涉及的用户个人信息的收集、存储、使用、加工、传输、提供、公开和应用等处理,均符合相关法律法规的规定,采取了必要保密措施,且不违背公序良俗。在本公开的技术方案中,在获取或采集用户个人信息之前,均获取了用户的授权或同意。In the technical solution of this disclosure, the collection, storage, use, processing, transmission, provision, disclosure and application of user personal information are in compliance with relevant laws and regulations, necessary confidentiality measures have been taken, and do not violate public order and good customs . In the technical solution of the present disclosure, the user's authorization or consent is obtained before obtaining or collecting the user's personal information.
深度学习框架可以为开发者提供相关的模型接口。深度学习框架也可以从开发者的角度出发,提供用于分析场景问题的辅助工具,以便提高开发效率及开发体验。深度学习框架例如可以包括飞桨(PaddlePaddle)、张量流(Tensorflow)等等。在这些框架中,例如飞桨框架可以支持深度学习模型和机器学习模型。Deep learning frameworks can provide developers with relevant model interfaces. The deep learning framework can also provide auxiliary tools for analyzing scenario problems from the developer's perspective to improve development efficiency and development experience. Deep learning frameworks may include, for example, PaddlePaddle, Tensorflow, and so on. Among these frameworks, for example, the Flying Paddle framework can support deep learning models and machine learning models.
对于深度学习模型,模型的学习形式可以包括有监督学习和无监督学习等等。有监督学习例如可以包括多任务学习。可以将优化多于一个目标函数的任务称为多任务学习。与多任务学习对应的是单任务学习。对于单任务学习,可以研究一个目标函数的优化。为了实现多任务学习,需要处理多个任务的数据,以及平衡多个任务之间学习速度。For deep learning models, the learning forms of the model can include supervised learning and unsupervised learning, etc. Supervised learning can include, for example, multi-task learning. The task of optimizing more than one objective function can be called multi-task learning. The counterpart to multi-task learning is single-task learning. For single-task learning, the optimization of an objective function can be studied. In order to achieve multi-task learning, it is necessary to process data from multiple tasks and balance the learning speed between multiple tasks.
多任务学习的应用范围较广,可以应用于自然语言处理、计算机视觉 和信息推荐等技术领域。在一些实施例中,以多任务学习在信息推荐领域的应用为示例,对象的行为数据可以涉及多个任务。例如,对于短视频,确定视频完播率、确定点击率、确定关注率、确定点赞率等可以分别作为一个任务。Multi-task learning has a wide range of applications and can be applied to natural language processing, computer vision and information recommendation and other technical fields. In some embodiments, taking the application of multi-task learning in the field of information recommendation as an example, the behavioral data of an object may involve multiple tasks. For example, for short videos, determining the video completion rate, determining the click rate, determining the attention rate, determining the like rate, etc. can be regarded as one task respectively.
在一些实施例中,多任务学习可以由多任务模型实现。在训练多任务模型的过程中,不同的任务之间可以保持一定的平衡,以便多任务模型学习到的参数对每个任务都有一定的适用性。例如,为了平衡不同任务的学习速度,可以设置任务的权重。然而,在训练过程中,用于执行多个任务的不同子模型的梯度变化难以用一个常量来表示。即使有些子模型的梯度变化可以用常量来表示,也需要针对一个子模型设置多组实验来进行调试。由此,训练多任务模型的时间成本较高。In some embodiments, multi-task learning can be implemented by a multi-task model. In the process of training a multi-task model, a certain balance can be maintained between different tasks so that the parameters learned by the multi-task model have certain applicability to each task. For example, in order to balance the learning speed of different tasks, the weight of the tasks can be set. However, during the training process, the gradient changes of different sub-models used to perform multiple tasks are difficult to represent with a constant. Even if the gradient changes of some sub-models can be represented by constants, multiple sets of experiments need to be set up for one sub-model for debugging. Therefore, the time cost of training multi-task models is relatively high.
图1是根据本公开的一个实施例的多任务模型的训练方法的流程图。Figure 1 is a flow chart of a training method for a multi-task model according to an embodiment of the present disclosure.
如图1所示,该方法100可以包括操作S110至操作S150。As shown in FIG. 1 , the method 100 may include operations S110 to S150.
在本公开实施例中,多任务模型包括多个任务处理子模型和共享子模型。例如,每个任务处理子模型用于处理一个任务。又例如,共享子模型可以用于提取样本数据的特征。In embodiments of the present disclosure, the multi-tasking model includes multiple task processing sub-models and sharing sub-models. For example, each task processing submodel is used to process one task. As another example, the shared submodel can be used to extract features of sample data.
在操作S110,将样本对象的样本行为数据输入共享子模型,得到样本对象的行为特征信息。In operation S110, the sample behavior data of the sample object is input into the shared sub-model to obtain the behavior characteristic information of the sample object.
在本公开实施例中,样本行为数据可以是样本对象在一个样本时段内的行为数据。例如,在样本时段内,可以采集样本对象对一个或多个历史信息执行的历史行为。在一个示例中,历史信息例如可以是样本对象已经观看过的短视频,这些历史行为例如可以包括点击、关注、点赞、评论等行为。In the embodiment of the present disclosure, the sample behavior data may be the behavior data of the sample object within a sample period. For example, within the sample period, the historical behavior performed by the sample object on one or more historical information can be collected. In one example, the historical information may be, for example, short videos that the sample subject has watched. These historical behaviors may include, for example, clicks, followings, likes, comments, and other behaviors.
在本公开实施例中,行为特征信息包括多个行为特征子信息。In the embodiment of the present disclosure, the behavioral characteristic information includes a plurality of behavioral characteristic sub-information.
例如,一个行为特征子信息可以与一个行为对应。For example, a behavioral characteristic sub-information can correspond to a behavior.
在操作S120,根据与任务处理子模型相关的行为特征子信息,得到任务处理子模型的子损失值。In operation S120, a sub-loss value of the task processing sub-model is obtained based on the behavioral characteristic sub-information related to the task processing sub-model.
在本公开实施例中,可以利用任务处理子模型处理行为特征子信息,得到输出结果。根据输出结果,确定任务处理子模型的子损失值。In the embodiment of the present disclosure, the task processing sub-model can be used to process the behavioral characteristic sub-information to obtain the output result. Based on the output results, determine the sub-loss value of the task processing sub-model.
例如,输出结果可以指示样本对象对样本信息执行一个行为的概率。 在一个示例中,样本信息可以是样本对象未观看过的短视频。For example, the output result can indicate the probability that the sample object performs an action on the sample information. In one example, the sample information may be a short video that the sample subject has not watched.
又例如,基于有监督学习、半监督学习或无监督学习的方式,根据输出结果,可以利用各种损失函数确定任务处理子模型的子损失值。For another example, based on supervised learning, semi-supervised learning or unsupervised learning, various loss functions can be used to determine the sub-loss value of the task processing sub-model based on the output results.
在操作S130,根据任务处理子模型的子损失值,确定与子损失值对应的目标梯度值。In operation S130, a target gradient value corresponding to the sub-loss value is determined according to the sub-loss value of the task processing sub-model.
在本公开实施例中,目标梯度值可以为共享子模型的一个梯度值。In the embodiment of the present disclosure, the target gradient value may be a gradient value of the shared sub-model.
例如,任务处理子模型可以包括至少一个任务处理层。又例如,共享子模型可以包括至少一个共享层For example, the task processing sub-model may include at least one task processing layer. As another example, the shared submodel may include at least one shared layer
例如,根据任务处理子模型的子损失值,可以确定任务处理子模型的至少一个梯度值,也可以确定共享子模型的至少一个梯度值。可以根据共享子模型的至少一个梯度值,确定与子损失值对应的目标梯度值。For example, according to the sub-loss value of the task processing sub-model, at least one gradient value of the task processing sub-model can be determined, or at least one gradient value of the shared sub-model can be determined. The target gradient value corresponding to the sub-loss value may be determined based on at least one gradient value of the shared sub-model.
例如,可以将共享子模型的至少一个梯度值中的任一个梯度值作为目标梯度值。For example, any one of the at least one gradient value of the shared sub-model can be used as the target gradient value.
在操作S140,根据多个目标梯度值,确定与子损失值对应的权重值。In operation S140, a weight value corresponding to the sub-loss value is determined according to the plurality of target gradient values.
在本公开实施例中,根据多个目标梯度值以及与子损失值对应的目标梯度值,可以确定与子损失值对应的权重值。In embodiments of the present disclosure, according to multiple target gradient values and the target gradient value corresponding to the sub-loss value, the weight value corresponding to the sub-loss value may be determined.
例如,可以根据多个目标梯度值,进行各种运算,得到梯度值的运算结果。各种运算例如可以包括求和、相乘等等。又例如,根据梯度值的运算结果以及与子损失值对应的目标梯度值,进行各种运算,可以得到与子损失值对应的权重值。For example, various operations can be performed based on multiple target gradient values to obtain the operation results of the gradient values. Various operations may include, for example, summation, multiplication, and the like. For another example, by performing various operations based on the calculation result of the gradient value and the target gradient value corresponding to the sub-loss value, the weight value corresponding to the sub-loss value can be obtained.
在操作S150,根据多个子损失值以及分别与多个子损失值对应的多个权重值,训练多任务模型。In operation S150, a multi-task model is trained according to a plurality of sub-loss values and a plurality of weight values respectively corresponding to the plurality of sub-loss values.
在本公开实施例中,根据多个子损失值以及分别与多个子损失值对应的多个权重值,可以得到损失值。根据损失值,可以训练多任务模型。In the embodiment of the present disclosure, a loss value can be obtained based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values. Based on the loss value, a multi-task model can be trained.
例如,可以根据多个子损失值以及分别与多个子损失值对应的多个权重值,进行各种运算,得到损失值。在一个示例中,各种运算可以包括加权求和、加权平均等等。For example, various operations can be performed based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values to obtain the loss value. In one example, various operations may include weighted sums, weighted averages, and the like.
例如,基于梯度下降算法或反向传播算法,可以根据损失值训练多任务模型。For example, based on the gradient descent algorithm or the backpropagation algorithm, a multi-task model can be trained based on the loss value.
通过本公开实施例,根据多个目标梯度值确定了最终的损失值,关注 了不同任务之间学习速度的差异,可以有效地平衡不同任务的学习速度,使得多任务模型学习到的参数对不同的任务都有一定的适用性。由此获得的多任务模型具有更强的多任务并行处理能力,在硬件资源预先设定或有限的情况下,提高了数据处理效率,可以根据对象的行为数据推荐更加准确的信息。Through the embodiment of the present disclosure, the final loss value is determined based on multiple target gradient values. Pay attention to It eliminates the difference in learning speed between different tasks and can effectively balance the learning speed of different tasks, making the parameters learned by the multi-task model have certain applicability to different tasks. The multi-task model obtained thus has stronger multi-task parallel processing capabilities, improves data processing efficiency when hardware resources are preset or limited, and can recommend more accurate information based on the object's behavioral data.
可以理解,上文对本公开的多任务模型的训练方法进行了详细描述。下面将结合相关实施例及图2对本公开中获得任务处理子模型的子损失值的方式进行详细描述。It can be understood that the training method of the multi-task model of the present disclosure is described in detail above. The method of obtaining the sub-loss value of the task processing sub-model in the present disclosure will be described in detail below in conjunction with relevant embodiments and Figure 2.
图2是根据本公开的一个实施例的获得任务处理子模型的子损失值的示意图。Figure 2 is a schematic diagram of obtaining sub-loss values of a task processing sub-model according to one embodiment of the present disclosure.
如图2所示,多任务模型可以包括共享子模型210和n个任务处理子模型。n为大于1的整数。n个任务处理子模型例如可以包括第1个任务处理子模型221、第2个任务处理子模型222、……、第n个任务处理子模型223。在一个示例中,n可以为3。As shown in Figure 2, the multi-task model may include a shared sub-model 210 and n task processing sub-models. n is an integer greater than 1. The n task processing sub-models may include, for example, the first task processing sub-model 221, the second task processing sub-model 222, ..., the n-th task processing sub-model 223. In one example, n can be 3.
在本公开实施例中,可以将样本对象的样本行为数据输入共享子模型,得到样本对象的行为特征信息。行为特征信息可以包括n个行为特征子信息。In the embodiment of the present disclosure, the sample behavior data of the sample object can be input into the shared sub-model to obtain the behavioral characteristic information of the sample object. Behavioral characteristic information may include n behavioral characteristic sub-information.
例如,共享子模型210可以包括多个共享层。多个共享层可以为第1个共享层、第2个共享层、……、第m个共享层,m为大于1的整数。第1个共享层可以处理样本行为数据,得到第1个初始行为特征信息。第2个共享层可以处理第1个初始行为特征信息,得到第2个初始行为特征信息。……第m个共享层可以处理第m-1个初始行为特征信息,得到第m个初始行为特征信息。可以将第m个初始行为特征信息作为样本对象的行为特征信息。For example, shared submodel 210 may include multiple shared layers. The multiple shared layers can be the first shared layer, the second shared layer,..., the m-th shared layer, where m is an integer greater than 1. The first shared layer can process sample behavioral data and obtain the first initial behavioral characteristic information. The second shared layer can process the first initial behavioral characteristic information and obtain the second initial behavioral characteristic information. ...The m-th shared layer can process the m-1 initial behavioral feature information and obtain the m-th initial behavioral feature information. The m-th initial behavioral characteristic information can be used as the behavioral characteristic information of the sample object.
例如,n个行为特征子信息可以包括第1个行为特征子信息、第2个行为特征子信息、……、第n个行为特征子信息。For example, the n behavioral characteristic sub-information may include the first behavioral characteristic sub-information, the second behavioral characteristic sub-information, ..., the n-th behavioral characteristic sub-information.
在本公开实施例中,根据与任务处理子模型相关的行为特征子信息,可以得到任务处理子模型的子损失值。In the embodiment of the present disclosure, according to the behavioral characteristic sub-information related to the task processing sub-model, the sub-loss value of the task processing sub-model can be obtained.
在本公开实施例中,可以将与任务处理子模型相关的行为特征子信息输入任务处理子模型,得到任务处理子模型的输出结果。根据输出结果, 可以得到任务处理子模型的子损失值。In the embodiment of the present disclosure, the behavioral characteristic sub-information related to the task processing sub-model can be input into the task processing sub-model to obtain the output result of the task processing sub-model. According to the output results, The sub-loss value of the task processing sub-model can be obtained.
例如,可以将第1个行为特征子信息输入第1个任务处理子模型221,得到第1个输出结果2211。第1个输出结果2211可以指示样本对象执行第1个行为的预测概率。在一个示例中,根据第1个输出结果2211和样本行为数据的第1个子标签,可以利用各种损失函数确定第1个子损失值2212。第1个子标签可以指示样本对象执行第1个行为的真实概率。各种损失函数例如可以包括交叉熵(Cross Entropy,CE)损失函数、L1损失函数等等。For example, the first behavioral characteristic sub-information can be input into the first task processing sub-model 221 to obtain the first output result 2211. The first output result 2211 may indicate the predicted probability of the sample object performing the first behavior. In one example, based on the first output result 2211 and the first sub-label of the sample behavior data, various loss functions can be used to determine the first sub-loss value 2212. The first sub-label can indicate the true probability of the sample object performing the first behavior. Various loss functions may include, for example, Cross Entropy (CE) loss function, L1 loss function, etc.
例如,可以将第2个行为特征子信息输入第2个任务处理子模型222,得到第2个输出结果2221。第2个输出结果2221可以指示样本对象执行第2个行为的预测概率。在一个示例中,根据第2个输出结果2221和样本行为数据的第2个子标签,可以利用各种损失函数确定第2个子损失值2222。第2个子标签可以指示样本对象执行第2个行为的真实概率。For example, the second behavioral characteristic sub-information can be input into the second task processing sub-model 222 to obtain the second output result 2221. The second output result 2221 may indicate the predicted probability of the sample object performing the second behavior. In one example, based on the second output result 2221 and the second sub-label of the sample behavior data, various loss functions can be used to determine the second sub-loss value 2222. The second sub-label can indicate the true probability of the sample object performing the second behavior.
例如,可以将第n个行为特征子信息输入第n个任务处理子模型223,得到第n个输出结果2231。第n个输出结果2231可以指示样本对象执行第n个行为的预测概率。在一个示例中,根据第n个输出结果2231和样本行为数据的第n个子标签,利用各种损失函数确定第n个子损失值2232。第n个子标签可以指示样本对象执行第n个行为的真实概率。For example, the n-th behavioral characteristic sub-information can be input into the n-th task processing sub-model 223 to obtain the n-th output result 2231. The nth output result 2231 may indicate the predicted probability of the sample object performing the nth behavior. In one example, various loss functions are used to determine the n-th sub-loss value 2232 based on the n-th output result 2231 and the n-th sub-label of the sample behavior data. The nth sub-label can indicate the true probability of the sample object performing the nth behavior.
在本公开实施例中,针对一个样本信息和样本行为数据201,可以进行人工标注,得到针对该样本信息的标签。该标签可以包括n个子标签。n个子标签例如可以包括上述的第1个子标签、第2个子标签、……、第n个子标签。In the embodiment of the present disclosure, manual annotation can be performed on a piece of sample information and sample behavior data 201 to obtain a label for the sample information. The tag can include n sub-tags. The n sub-tags may include, for example, the above-mentioned first sub-tag, second sub-tag, ..., n-th sub-tag.
可以理解,上文对获得任务处理子模型的子损失值的一些实施方式进行了详细描述。下面将结合相关实施例和图3对确定与子损失值对应的目标梯度值的方式进行详细描述。It can be understood that some implementation methods for obtaining sub-loss values of task processing sub-models are described in detail above. The method of determining the target gradient value corresponding to the sub-loss value will be described in detail below with reference to relevant embodiments and Figure 3.
图3是根据本公开的一个实施例的确定与子损失值对应的目标梯度值的示意图。FIG. 3 is a schematic diagram of determining a target gradient value corresponding to a sub-loss value according to an embodiment of the present disclosure.
如图3所示,多任务模型可以包括共享子模型310和n个任务处理子模型。n个任务处理子模型例如可以包括第1个任务处理子模型321、第2个任务处理子模型322、……、第n个任务处理子模型323。在一个示例 中,n可以为3。As shown in Figure 3, the multi-task model may include a shared sub-model 310 and n task processing sub-models. The n task processing sub-models may include, for example, the first task processing sub-model 321, the second task processing sub-model 322, ..., and the n-th task processing sub-model 323. In an example , n can be 3.
共享子模型310也可以包括多个共享层。多个共享层可以为第1个共享层、第2个共享层、……、第m个共享层,m为大于1的整数。如上文所述,第1个共享层可以处理样本行为数据,得到第1个初始行为特征信息。第2个共享层可以处理第1个初始行为特征信息,得到第2个初始行为特征信息。……第m个共享层可以处理第m-1个初始行为特征信息,得到第m个初始行为特征信息。可以将第m个初始行为特征信息作为样本对象的行为特征信息。Shared sub-model 310 may also include multiple shared layers. The multiple shared layers can be the first shared layer, the second shared layer,..., the m-th shared layer, where m is an integer greater than 1. As mentioned above, the first shared layer can process sample behavioral data and obtain the first initial behavioral characteristic information. The second shared layer can process the first initial behavioral characteristic information and obtain the second initial behavioral characteristic information. ...The m-th shared layer can process the m-1 initial behavioral feature information and obtain the m-th initial behavioral feature information. The m-th initial behavioral characteristic information can be used as the behavioral characteristic information of the sample object.
在本公开实施例中,在确定了任务处理子模型的子损失值之后,基于各种方式(例如反向传播),可以确定与该损失值对应的目标梯度值。可以理解,上述的关于第1个子损失值2212、第2个子损失值2222、……、第n个子损失值2232的详细描述,也可以适用于本实施例,本公开在此不再赘述。In embodiments of the present disclosure, after determining the sub-loss value of the task processing sub-model, based on various methods (such as backpropagation), the target gradient value corresponding to the loss value may be determined. It can be understood that the above detailed description of the first sub-loss value 2212, the second sub-loss value 2222, ..., the n-th sub-loss value 2232 can also be applied to this embodiment, and the present disclosure will not be repeated here.
例如,基于反向传播算法,根据第1个子损失值3212以及第1个任务处理子模型321的参数,可以确定第1个任务处理子模型321的梯度值。接下来,再根据共享子模型310的参数,可以确定共享子模型310的m个第一梯度值。在m个第一梯度值中,可以将第1个共享层的第一梯度值作为第1个目标梯度值3213。可以理解,在确定第1个共享层的第一梯度值的过程中,可以使用第1个共享层的相关参数。For example, based on the back propagation algorithm, based on the first sub-loss value 3212 and the parameters of the first task processing sub-model 321, the gradient value of the first task processing sub-model 321 can be determined. Next, based on the parameters of the shared sub-model 310, the m first gradient values of the shared sub-model 310 can be determined. Among the m first gradient values, the first gradient value of the first shared layer can be used as the first target gradient value 3213. It can be understood that in the process of determining the first gradient value of the first shared layer, the relevant parameters of the first shared layer can be used.
例如,基于反向传播算法,根据第2个子损失值3222以及第2个任务处理子模型322的参数,可以确定第2个任务处理子模型322的梯度值。接下来,再根据共享子模型310的参数,可以确定共享子模型310的m个第二梯度值。在m个第二梯度值中,可以将第1个共享层的第二梯度值作为第2个目标梯度值3223。可以理解,在确定第1个共享层的第二梯度值的过程中,可以使用第1个共享层的相关参数。For example, based on the back propagation algorithm, the gradient value of the second task processing sub-model 322 can be determined based on the second sub-loss value 3222 and the parameters of the second task processing sub-model 322. Next, based on the parameters of the shared sub-model 310, m second gradient values of the shared sub-model 310 can be determined. Among the m second gradient values, the second gradient value of the first shared layer can be used as the second target gradient value 3223. It can be understood that in the process of determining the second gradient value of the first shared layer, the relevant parameters of the first shared layer can be used.
例如,基于反向传播算法,根据第n个子损失值3232以及第n个任务处理子模型323的参数,可以确定第n个任务处理子模型323的梯度值。接下来,再根据共享子模型310的参数,可以确定共享子模型310的m个第n梯度值。在m个第n梯度值中,可以将第1个共享层的第n梯度值作为第n个目标梯度值3233。可以理解,在确定第1个共享层的第n梯度值 的过程中,可以使用第1个共享层的相关参数。For example, based on the back-propagation algorithm, the gradient value of the n-th task processing sub-model 323 can be determined according to the n-th sub-loss value 3232 and the parameters of the n-th task processing sub-model 323. Next, based on the parameters of the shared sub-model 310, the m nth gradient values of the shared sub-model 310 can be determined. Among the m nth gradient values, the nth gradient value of the first shared layer can be used as the nth target gradient value 3233. It can be understood that when determining the nth gradient value of the first shared layer In the process, you can use the relevant parameters of the first shared layer.
可以理解,在前向传播过程中,第1个共享层可以处理样本行为数据。在反向传播过程中,第1个共享层可以为多个共享层的最后一个共享层。It can be understood that during the forward propagation process, the first shared layer can process sample behavioral data. During the backpropagation process, the first shared layer can be the last shared layer of multiple shared layers.
在另一些实施例中,也可以将m个第一梯度值中任一个第一梯度值作为第1个目标梯度值。也可以将m个第二梯度值中任一个第二梯度值作为第2个目标梯度值。……也可以将m个第n梯度值中任一个第n梯度值作为第n个目标梯度值。In other embodiments, any first gradient value among the m first gradient values may also be used as the first target gradient value. It is also possible to use any second gradient value among the m second gradient values as the second target gradient value. ...You can also use any n-th gradient value among the m n-th gradient values as the n-th target gradient value.
可以理解,上文对确定与子损失值对应的目标梯度值的一些实施方式进行了详细描述。下面将结合相关实施例和图4对确定与子损失值对应的权重值进行详细描述。It can be understood that some embodiments of determining the target gradient value corresponding to the sub-loss value are described in detail above. Determining the weight value corresponding to the sub-loss value will be described in detail below with reference to relevant embodiments and Figure 4.
图4是根据本公开的一个实施例的多任务模型的训练方法的流程图。Figure 4 is a flowchart of a training method for a multi-task model according to one embodiment of the present disclosure.
可以理解,上述的操作S110以及操作S120也可以适用于本实施例。It can be understood that the above-mentioned operation S110 and operation S120 may also be applicable to this embodiment.
在本公开实施例中,可以将样本对象的样本行为数据输入共享子模型,得到样本对象的行为特征信息。例如,行为特征信息包括多个行为特征子信息。In the embodiment of the present disclosure, the sample behavior data of the sample object can be input into the shared sub-model to obtain the behavioral characteristic information of the sample object. For example, behavioral characteristic information includes multiple behavioral characteristic sub-information.
在本公开实施例中,根据与任务处理子模型相关的行为特征子信息,得到任务处理子模型的子损失值。例如,可以获得第1个子损失值loss_1。第1个子损失值loss_1可以是第1个任务处理子模型的子损失值。例如,可以获得第2个子损失值loss_2。第2个子损失值loss_2可以是第2个任务处理子模型的子损失值。例如,可以获得第n个子损失值loss_n。第n个子损失值loss_n可以是第n个任务处理子模型的子损失值。可以理解,上述获得第1个子损失值2212、第2个子损失值2222、……、第n个子损失值2312的方式也可以适用于本实施例,本公开在此不再赘述。In the embodiment of the present disclosure, the sub-loss value of the task processing sub-model is obtained according to the behavioral characteristic sub-information related to the task processing sub-model. For example, the first sub-loss value loss_1 can be obtained. The first sub-loss value loss_1 can be the sub-loss value of the first task processing sub-model. For example, the second sub-loss value loss_2 can be obtained. The second sub-loss value loss_2 can be the sub-loss value of the second task processing sub-model. For example, the nth sub-loss value loss_n can be obtained. The n-th sub-loss value loss_n may be the sub-loss value of the n-th task processing sub-model. It can be understood that the above-mentioned method of obtaining the first sub-loss value 2212, the second sub-loss value 2222, ..., and the n-th sub-loss value 2312 can also be applied to this embodiment, and the disclosure will not be repeated here.
接下来,可以执行操作S430、操作S440以及操作S451。Next, operations S430, S440, and S451 may be performed.
在操作S430,确定与子损失值对应的目标梯度值。In operation S430, a target gradient value corresponding to the sub-loss value is determined.
例如,可以根据第1个子损失值loss_1,确定与第1个子损失值loss_1对应的第1个目标梯度值grad_1。又例如,可以根据第2个子损失值loss_2,确定与第2个子损失值loss_2对应的第2个目标梯度值grad_2。……又例如,可以根据第n个子损失值loss_n,确定与第n个子损失值loss_n对应的第n个目标梯度值grad_n。 For example, the first target gradient value grad_1 corresponding to the first sub-loss value loss_1 can be determined based on the first sub-loss value loss_1. For another example, the second target gradient value grad_2 corresponding to the second sub-loss value loss_2 can be determined based on the second sub-loss value loss_2. ...For another example, the n-th target gradient value grad_n corresponding to the n-th sub-loss value loss_n can be determined based on the n-th sub-loss value loss_n.
可以理解,在确定第1个目标梯度值grad_1、第2个目标梯度值grad_2、……、第n个目标梯度值grad_n的过程中,可以使用共享子模型的第1个共享层的相关参数para_last_shared_layer。It can be understood that in the process of determining the first target gradient value grad_1, the second target gradient value grad_2, ..., and the nth target gradient value grad_n, the relevant parameters para_last_shared_layer of the first shared layer of the shared sub-model can be used .
可以理解,上述确定第1个目标梯度值3213、第2个目标梯度值3223、……、第n个目标梯度值3233的方式也可以适用于本实施例,本公开在此不再赘述。It can be understood that the above-mentioned method of determining the first target gradient value 3213, the second target gradient value 3223, ..., the n-th target gradient value 3233 can also be applied to this embodiment, and the disclosure will not be repeated here.
在操作S440,确定与子损失值对应的权重值。In operation S440, a weight value corresponding to the sub-loss value is determined.
在本公开实施例中,可以根据多个目标梯度值,确定与子损失值对应的权重值。In the embodiment of the present disclosure, the weight value corresponding to the sub-loss value can be determined according to multiple target gradient values.
例如,可以根据多个目标梯度值,确定处理参数值。For example, the processing parameter values can be determined based on multiple target gradient values.
例如,可以根据第1个目标梯度值grad_1、第2个目标梯度值grad_2、……、第n个目标梯度值grad_n,确定处理参数值。For example, the processing parameter value can be determined based on the first target gradient value grad_1, the second target gradient value grad_2, ..., and the nth target gradient value grad_n.
例如,根据处理参数值和与子损失值对应的目标梯度值,可以确定与子损失值对应的权重值。For example, based on the processing parameter value and the target gradient value corresponding to the sub-loss value, the weight value corresponding to the sub-loss value may be determined.
例如,对与子损失值对应的目标梯度值进行处理,可以得到处理后目标梯度值。根据处理参数值,可以对处理后目标梯度值进行归一化处理,得到归一化梯度值。根据归一化梯度值的倒数,可以确定与子损失值对应的权重值。在训练过程中,对于产生的梯度较大的任务,通过本公开实施例,该任务的权重值可以较小。在最终的损失值中,该任务的比重下降,可以有效地平衡不同任务之间的学习速度。For example, by processing the target gradient value corresponding to the sub-loss value, the processed target gradient value can be obtained. According to the processing parameter value, the processed target gradient value can be normalized to obtain the normalized gradient value. According to the reciprocal of the normalized gradient value, the weight value corresponding to the sub-loss value can be determined. During the training process, for a task that generates a larger gradient, the weight value of the task can be smaller through the embodiments of the present disclosure. In the final loss value, the proportion of the task decreases, which can effectively balance the learning speed between different tasks.
例如,可以通过以下公式确定与第i个子损失值loss_i对应的第i个权重值w_i。
For example, the i-th weight value w_i corresponding to the i-th sub-loss value loss_i can be determined by the following formula.
i可以为大于或等于1的整数,i可以为小于或等于n的整数。i can be an integer greater than or equal to 1, and i can be an integer less than or equal to n.
iexp(grad_i)可以为处理参数值。exp(grad_i)可以为第i个处理后目标梯度值。i exp(grad_i) can be a processing parameter value. exp(grad_i) can be the i-th processed target gradient value.
可以理解,根据公式一,可以确定第1个权重值w_1、第2个权重值w_2、……、第n个权重值W_n。It can be understood that according to Formula 1, the first weight value w_1, the second weight value w_2,..., and the nth weight value W_n can be determined.
在一个示例中,对第1个目标梯度值grad_1进行处理,可以得到第1 个处理后目标梯度值exp(grad_1)。根据处理参数值∑iexp(grad_i),对第1个处理后目标梯度值exp(grad_1)进行归一化,可以得到第1个归一化梯度值exp(grad_1)/∑iexp(grad_i)。可以将第1个归一化梯度值的倒数,作为第1个权重值w_1。In an example, the first target gradient value grad_1 is processed to obtain the first A processed target gradient value exp(grad_1). According to the processing parameter value ∑ i exp(grad_i), normalize the first processed target gradient value exp(grad_1) to get the first normalized gradient value exp(grad_1)/∑ i exp(grad_i) . The reciprocal of the first normalized gradient value can be used as the first weight value w_1.
在操作S451,获得损失值。In operation S451, a loss value is obtained.
在本公开实施例中,可以根据多个子损失值以及分别与多个子损失值对应的多个权重值,得到损失值。In the embodiment of the present disclosure, the loss value can be obtained based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
例如,可以根据多个子损失值以及分别与多个子损失值对应的多个权重值,进行加权求和,得到损失值。For example, the loss value can be obtained by performing a weighted sum based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
例如,可以根据第1个子损失值loss_1、第2个子损失值loss_2、……、第n个子损失值loss_n以及第1个权重值w_1、第2个权重值w_2、……、第n个权重值w_n,进行加权求和,得到损失值Loss。For example, the first sub-loss value loss_1, the second sub-loss value loss_2,..., the n-th sub-loss value loss_n and the first weight value w_1, the second weight value w_2,..., the n-th weight value can be used w_n, perform weighted summation to obtain the loss value Loss.
在一个示例中,可以通过以下公式得到损失值Loss:
Loss=loss_1*w_1+loss_2*w_2+…...+loss_n*w_n  (公式二)
In an example, the loss value Loss can be obtained by the following formula:
Loss=loss_1*w_1+loss_2*w_2+………+loss_n*w_n (Formula 2)
在本公开实施例中,可以根据损失值,训练多任务模型。In the embodiment of the present disclosure, a multi-task model can be trained according to the loss value.
例如,可以根据损失值,分别调整多个任务处理子模型以及共享子模型的参数,以训练多任务模型。在一个示例中,基于反向传播算法,可以利用损失值调整多个任务处理子模型和共享子模型的参数。For example, the parameters of multiple task processing sub-models and shared sub-models can be adjusted separately according to the loss value to train a multi-task model. In one example, based on the backpropagation algorithm, the loss value can be used to adjust parameters of multiple task processing sub-models and shared sub-models.
图5是根据本公开的另一个实施例的信息推荐方法的流程图。Figure 5 is a flow chart of an information recommendation method according to another embodiment of the present disclosure.
如图5所示,方法500可以包括操作S510至操作S520。As shown in FIG. 5 , the method 500 may include operations S510 to S520.
在操作S510,将目标对象的目标行为数据输入多任务模型,得到多个输出结果。In operation S510, the target behavior data of the target object is input into the multi-task model to obtain multiple output results.
在本公开实施例中,多任务模型可以是根据本公开提供的方法训练的。例如,多任务模型可以根据方法100训练得到的。In embodiments of the present disclosure, the multi-task model may be trained according to the method provided by the present disclosure. For example, a multi-task model can be trained according to method 100.
在本公开实施例中,输出结果可以指示目标对象对一个候选信息执行一个动作的概率。In embodiments of the present disclosure, the output result may indicate the probability that the target object performs an action on a piece of candidate information.
例如,候选信息可以为短视频、图像或文本等信息。在一个示例中,一个输出结果可以指示目标对象对该候选信息执行“点击”动作的概率。在一个示例中,另一个输出结果可以指示目标对象对该候选信息执行“评论”动作的概率。 For example, the candidate information can be short video, image or text information. In one example, an output result may indicate the probability that the target object performs a "click" action on the candidate information. In one example, another output result may indicate the probability that the target object performs a "comment" action on the candidate information.
在操作S520,根据多个输出结果,向目标对象推荐目标信息。In operation S520, target information is recommended to the target object according to the plurality of output results.
在本公开实施例中,根据多个输出结果,确定候选信息的推荐参数。根据多个候选信息的推荐参数,从多个候选信息中确定目标信息。向目标对象推荐目标信息。In the embodiment of the present disclosure, recommended parameters of candidate information are determined based on multiple output results. The target information is determined from the plurality of candidate information according to the recommended parameters of the plurality of candidate information. Recommend targeted information to target audiences.
例如,输出结果可以归一化为一个大于0且小于1的值。根据多个归一化后的输出结果,可以进行各种运算,得到一个候选信息的推荐参数。在一个示例中,各种运算例如可以包括求平均、求和、加权求和等等。For example, the output can be normalized to a value greater than 0 and less than 1. Based on multiple normalized output results, various operations can be performed to obtain recommended parameters for a candidate information. In one example, various operations may include, for example, averaging, summing, weighted sums, and the like.
例如,可以将推荐参数最大的候选信息作为目标信息,推荐给目标对象。For example, the candidate information with the largest recommendation parameter can be used as the target information and recommended to the target object.
通过本公开实施例,可以准确地向目标对象推荐信息,提高信息推送效率,提高用户体验。Through the embodiments of the present disclosure, information can be accurately recommended to target objects, information push efficiency can be improved, and user experience can be improved.
可以理解,在信息推荐等相关领域中,可以先为目标对象召回大量的候选信息,再从召回的候选信息中确定目标信息。上文对从候选信息中确定目标信息的一些实施方式进行了详细描述,下面将对召回候选信息的一些实施方式进行说明。It can be understood that in related fields such as information recommendation, a large amount of candidate information can be recalled for the target object first, and then the target information can be determined from the recalled candidate information. Some implementations of determining target information from candidate information are described in detail above, and some implementations of recalling candidate information will be described below.
在一些实施例中,可以根据目标对象的目标行为数据,从多个初始信息中确定候选信息。In some embodiments, candidate information may be determined from a plurality of initial information based on target behavior data of the target object.
例如,可以将目标行为数据转换一个目标向量。计算目标向量与初始信息的特征向量之间的相似度。在该相似度大于预设相似度阈值的情况下,可以将与该相似度对应的初始信息作为一个候选信息。For example, target behavior data can be converted into a target vector. Calculate the similarity between the target vector and the feature vector of the initial information. When the similarity is greater than the preset similarity threshold, the initial information corresponding to the similarity can be used as a candidate information.
例如,可以确定多个候选信息。For example, multiple candidate information may be determined.
可以理解,上文对本公开提供的多任务模型在信息推荐领域的应用方式进行了详细说明,但本公开不限于此。本公开提供的多任务模型也可以应用于其他领域(例如图像处理、文本处理、音频处理等)。It can be understood that the application method of the multi-task model provided by the present disclosure in the field of information recommendation is described in detail above, but the present disclosure is not limited thereto. The multi-task model provided by this disclosure can also be applied to other fields (such as image processing, text processing, audio processing, etc.).
图6是根据本公开的一个实施例的多任务模型的训练装置的框图。Figure 6 is a block diagram of a training device for a multi-task model according to an embodiment of the present disclosure.
在本公开实施例中,多任务模型可以包括多个任务处理子模型和共享子模型。In embodiments of the present disclosure, the multi-tasking model may include multiple task processing sub-models and sharing sub-models.
如图6所示,该装置600可以包括第一获得模块610、第二获得模块620、第一确定模块630、第二确定模块640和训练模块650。As shown in FIG. 6 , the device 600 may include a first obtaining module 610 , a second obtaining module 620 , a first determining module 630 , a second determining module 640 and a training module 650 .
第一获得模块610,用于将样本对象的样本行为数据输入共享子模型, 得到样本对象的行为特征信息。例如,行为特征信息包括多个行为特征子信息。The first obtaining module 610 is used to input the sample behavior data of the sample object into the shared sub-model, Obtain the behavioral characteristic information of the sample object. For example, behavioral characteristic information includes multiple behavioral characteristic sub-information.
第二获得模块620,用于根据与任务处理子模型相关的行为特征子信息,得到任务处理子模型的子损失值。The second obtaining module 620 is used to obtain the sub-loss value of the task processing sub-model according to the behavioral characteristic sub-information related to the task processing sub-model.
第一确定模块630,用于根据任务处理子模型的子损失值,确定与子损失值对应的目标梯度值。The first determination module 630 is configured to determine the target gradient value corresponding to the sub-loss value according to the sub-loss value of the task processing sub-model.
第二确定模块640,用于根据多个目标梯度值,确定与子损失值对应的权重值。The second determination module 640 is used to determine the weight value corresponding to the sub-loss value according to multiple target gradient values.
训练模块650,用于根据多个子损失值以及分别与多个子损失值对应的多个权重值,训练多任务模型。The training module 650 is used to train a multi-task model based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
在一些实施例中,第二确定模块包括:第一确定子模块,用于根据多个目标梯度值,确定处理参数值;以及第二确定子模块,用于根据处理参数值和与子损失值对应的目标梯度值,确定与子损失值对应的权重值。In some embodiments, the second determination module includes: a first determination sub-module, used to determine the processing parameter value according to the plurality of target gradient values; and a second determination sub-module, used to determine the processing parameter value and the sub-loss value according to the processing parameter value and the sub-loss value. The corresponding target gradient value determines the weight value corresponding to the sub-loss value.
在一些实施例中,第二确定子模块包括:处理单元,用于对与子损失值对应的目标梯度值进行处理,得到处理后目标梯度值;归一化处理单元,用于根据处理参数值,对处理后目标梯度值进行归一化处理,得到归一化梯度值;以及确定单元,用于根据归一化梯度值的倒数,确定与子损失值对应的权重值。In some embodiments, the second determination sub-module includes: a processing unit, used to process the target gradient value corresponding to the sub-loss value, to obtain the processed target gradient value; a normalization processing unit, used to process the target gradient value according to the processing parameter value , normalize the processed target gradient value to obtain a normalized gradient value; and a determination unit used to determine the weight value corresponding to the sub-loss value based on the reciprocal of the normalized gradient value.
在一些实施例中,第二获得模块包括:第一获得子模块,用于将与任务处理子模型相关的行为特征子信息输入任务处理子模型,得到任务处理子模型的输出结果;以及第二获得子模块,用于根据输出结果,得到任务处理子模型的子损失值。In some embodiments, the second obtaining module includes: a first obtaining sub-module, used to input the behavioral characteristic sub-information related to the task processing sub-model into the task processing sub-model, and obtain the output result of the task processing sub-model; and a second obtaining sub-module. Obtain sub-module, which is used to obtain the sub-loss value of the task processing sub-model based on the output results.
在一些实施例中,训练模块包括:第三获得子模块,用于根据多个子损失值以及分别与多个子损失值对应的多个权重值,得到损失值;以及训练子模块,用于根据损失值,训练多任务模型。In some embodiments, the training module includes: a third obtaining sub-module, used to obtain a loss value according to multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values; and a training sub-module, used to obtain a loss value according to the loss value to train a multi-task model.
在一些实施例中,第三获得子模块包括:加权求和单元,用于根据多个子损失值以及分别与多个子损失值对应的多个权重值,进行加权求和,得到损失值。In some embodiments, the third obtaining sub-module includes: a weighted summation unit, configured to perform weighted summation based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values, to obtain a loss value.
在一些实施例中,训练子模块包括:调整单元,用于根据损失值,分别调整多个任务处理子模型以及共享子模型的参数,以训练多任务模型。 In some embodiments, the training sub-module includes: an adjustment unit, configured to respectively adjust parameters of multiple task processing sub-models and shared sub-models according to the loss value to train the multi-task model.
图7是根据本公开的另一个实施例的信息推荐装置的框图。FIG. 7 is a block diagram of an information recommendation device according to another embodiment of the present disclosure.
如图7所示,该装置700可以包括第三获得模块710和推荐模块720。As shown in FIG. 7 , the device 700 may include a third obtaining module 710 and a recommendation module 720 .
第三获得模块710,用于将目标对象的目标行为数据输入多任务模型,得到多个输出结果。The third obtaining module 710 is used to input the target behavior data of the target object into the multi-task model to obtain multiple output results.
推荐模块720,用于根据多个输出结果,向目标对象推荐目标信息。The recommendation module 720 is used to recommend target information to target objects based on multiple output results.
例如,多任务模型是根据本公开提供的装置训练得到的。For example, the multi-task model is trained according to the device provided by the present disclosure.
根据本公开的实施例,本公开还提供了一种电子设备、一种可读存储介质和一种计算机程序产品。According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
在本公开实施例中,电子设备包括:至少一个处理器;以及与至少一个处理器通信连接的存储器。例如,存储器存储有可被至少一个处理器执行的指令,指令被至少一个处理器执行,以使至少一个处理器能够执行根据本公开提供的方法。In an embodiment of the present disclosure, an electronic device includes: at least one processor; and a memory communicatively connected with the at least one processor. For example, the memory stores instructions executable by at least one processor, and the instructions are executed by at least one processor, so that the at least one processor can perform the method provided according to the present disclosure.
在本公开实施例中,可读存储介质存储有计算机指令,可读存储介质可以为非瞬时计算机可读存储介质。例如,计算机指令可以使计算机执行根据本公开提供的方法。In embodiments of the present disclosure, the readable storage medium stores computer instructions, and the readable storage medium may be a non-transitory computer-readable storage medium. For example, computer instructions may cause a computer to perform methods provided in accordance with the present disclosure.
在本公开实施例中,计算机程序产品包括计算机程序,该计算机程序在被处理器执行时实现根据本公开提供的方法。下面将结合图8进行详细说明。In embodiments of the present disclosure, a computer program product includes a computer program that, when executed by a processor, implements a method provided in accordance with the present disclosure. Detailed description will be given below with reference to Figure 8 .
图8示出了可以用来实施本公开的实施例的示例电子设备800的示意性框图。电子设备旨在表示各种形式的数字计算机,诸如,膝上型计算机、台式计算机、工作台、个人数字助理、服务器、刀片式服务器、大型计算机、和其它适合的计算机。电子设备还可以表示各种形式的移动装置,诸如,个人数字处理、蜂窝电话、智能电话、可穿戴设备和其它类似的计算装置。本文所示的部件、它们的连接和关系、以及它们的功能仅仅作为示例,并且不意在限制本文中描述的和/或者要求的本公开的实现。Figure 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to refer to various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit implementations of the disclosure described and/or claimed herein.
如图8所示,设备800包括计算单元801,其可以根据存储在只读存储器(ROM)802中的计算机程序或者从存储单元808加载到随机访问存储器(RAM)803中的计算机程序,来执行各种适当的动作和处理。在RAM 803中,还可存储设备800操作所需的各种程序和数据。计算单元801、ROM 802以及RAM 803通过总线804彼此相连。输入/输出(I/O) 接口805也连接至总线804。As shown in FIG. 8 , the device 800 includes a computing unit 801 that can execute according to a computer program stored in a read-only memory (ROM) 802 or loaded from a storage unit 808 into a random access memory (RAM) 803 Various appropriate actions and treatments. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. Computing unit 801, ROM 802 and RAM 803 are connected to each other via bus 804. Input/output (I/O) Interface 805 is also connected to bus 804.
设备800中的多个部件连接至I/O接口805,包括:输入单元806,例如键盘、鼠标等;输出单元807,例如各种类型的显示器、扬声器等;存储单元808,例如磁盘、光盘等;以及通信单元809,例如网卡、调制解调器、无线通信收发机等。通信单元809允许设备800通过诸如因特网的计算机网络和/或各种电信网络与其他设备交换信息/数据。Multiple components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, etc.; an output unit 807, such as various types of displays, speakers, etc.; a storage unit 808, such as a magnetic disk, optical disk, etc. ; and communication unit 809, such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.
计算单元801可以是各种具有处理和计算能力的通用和/或专用处理组件。计算单元801的一些示例包括但不限于中央处理单元(CPU)、图形处理单元(GPU)、各种专用的人工智能(AI)计算芯片、各种运行机器学习模型算法的计算单元、数字信号处理器(DSP)、以及任何适当的处理器、控制器、微控制器等。计算单元801执行上文所描述的各个方法和处理,例如多任务模型的训练方法和/或信息推荐方法。例如,在一些实施例中,多任务模型的训练方法和/或信息推荐方法可被实现为计算机软件程序,其被有形地包含于机器可读介质,例如存储单元808。在一些实施例中,计算机程序的部分或者全部可以经由ROM 802和/或通信单元809而被载入和/或安装到设备800上。当计算机程序加载到RAM 803并由计算单元801执行时,可以执行上文描述的多任务模型的训练方法和/或信息推荐方法的一个或多个步骤。备选地,在其他实施例中,计算单元801可以通过其他任何适当的方式(例如,借助于固件)而被配置为执行多任务模型的训练方法和/或信息推荐方法。Computing unit 801 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, digital signal processing processor (DSP), and any appropriate processor, controller, microcontroller, etc. The computing unit 801 performs various methods and processes described above, such as a multi-task model training method and/or an information recommendation method. For example, in some embodiments, the multi-task model training method and/or the information recommendation method may be implemented as a computer software program, which is tangibly included in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the training method of the multi-task model and/or the information recommendation method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the multi-task model training method and/or the information recommendation method in any other suitable manner (eg, by means of firmware).
本文中以上描述的系统和技术的各种实施方式可以在数字电子电路系统、集成电路系统、场可编程门阵列(FPGA)、专用集成电路(ASIC)、专用标准产品(ASSP)、芯片上系统的系统(SOC)、复杂可编程逻辑设备(CPLD)、计算机硬件、固件、软件、和/或它们的组合中实现。这些各种实施方式可以包括:实施在一个或者多个计算机程序中,该一个或者多个计算机程序可在包括至少一个可编程处理器的可编程系统上执行和/或解释,该可编程处理器可以是专用或者通用可编程处理器,可以从存储系统、至少一个输入装置、和至少一个输出装置接收数据和指令,并且将数据和指令传输至该存储系统、该至少一个输入装置、和该至少一个输出装置。 Various implementations of the systems and techniques described above may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip implemented in a system (SOC), complex programmable logic device (CPLD), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs executable and/or interpreted on a programmable system including at least one programmable processor, the programmable processor The processor, which may be a special purpose or general purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device. An output device.
用于实施本公开的方法的程序代码可以采用一个或多个编程语言的任何组合来编写。这些程序代码可以提供给通用计算机、专用计算机或其他可编程数据处理装置的处理器或控制器,使得程序代码当由处理器或控制器执行时使流程图和/或框图中所规定的功能/操作被实施。程序代码可以完全在机器上执行、部分地在机器上执行,作为独立软件包部分地在机器上执行且部分地在远程机器上执行或完全在远程机器或服务器上执行。Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing device, such that the program codes, when executed by the processor or controller, cause the functions specified in the flowcharts and/or block diagrams/ The operation is implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
在本公开的上下文中,机器可读介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的程序。机器可读介质可以是机器可读信号介质或机器可读储存介质。机器可读介质可以包括但不限于电子的、磁性的、光学的、电磁的、红外的、或半导体系统、装置或设备,或者上述内容的任何合适组合。机器可读存储介质的更具体示例会包括基于一个或多个线的电气连接、便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或快闪存储器)、光纤、便捷式紧凑盘只读存储器(CD-ROM)、光学储存设备、磁储存设备、或上述内容的任何合适组合。In the context of this disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media would include one or more wire-based electrical connections, laptop disks, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
为了提供与用户的交互,可以在计算机上实施此处描述的系统和技术,该计算机具有:用于向用户显示信息的显示装置(例如,CRT(阴极射线管)显示器或者LCD(液晶显示器));以及键盘和指向装置(例如,鼠标或者轨迹球),用户可以通过该键盘和该指向装置来将输入提供给计算机。其它种类的装置还可以用于提供与用户的交互;例如,提供给用户的反馈可以是任何形式的传感反馈(例如,视觉反馈、听觉反馈、或者触觉反馈);并且可以用任何形式(包括声输入、语音输入或者、触觉输入)来接收来自用户的输入。To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (eg, a CRT (Cathode Ray Tube) display or an LCD (Liquid Crystal Display)) for displaying information to the user. ; and a keyboard and pointing device (eg, a mouse or a trackball) through which a user can provide input to the computer. Other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and may be provided in any form, including Acoustic input, voice input or tactile input) to receive input from the user.
可以将此处描述的系统和技术实施在包括后台部件的计算系统(例如,作为数据服务器)、或者包括中间件部件的计算系统(例如,应用服务器)、或者包括前端部件的计算系统(例如,具有图形用户界面或者网络浏览器的用户计算机,用户可以通过该图形用户界面或者该网络浏览器来与此处描述的系统和技术的实施方式交互)、或者包括这种后台部件、中间件部件、或者前端部件的任何组合的计算系统中。可以通过任何形式或者介质 的数字数据通信(例如,通信网络)来将系统的部件相互连接。通信网络的示例包括:局域网(LAN)、广域网(WAN)和互联网。The systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., A user's computer having a graphical user interface or web browser through which the user can interact with implementations of the systems and technologies described herein), or including such backend components, middleware components, or any combination of front-end components in a computing system. in any form or medium Digital data communications (e.g., communications networks) to connect the components of the system to each other. Examples of communication networks include: local area network (LAN), wide area network (WAN), and the Internet.
计算机系统可以包括客户端和服务器。客户端和服务器一般远离彼此并且通常通过通信网络进行交互。通过在相应的计算机上运行并且彼此具有客户端-服务器关系的计算机程序来产生客户端和服务器的关系。服务器可以是云服务器,也可以为分布式系统的服务器,或者是结合了区块链的服务器。Computer systems may include clients and servers. Clients and servers are generally remote from each other and typically interact over a communications network. The relationship of client and server is created by computer programs running on corresponding computers and having a client-server relationship with each other. The server can be a cloud server, a distributed system server, or a server combined with a blockchain.
应该理解,可以使用上面所示的各种形式的流程,重新排序、增加或删除步骤。例如,本公开中记载的各步骤可以并行地执行也可以顺序地执行也可以不同的次序执行,只要能够实现本公开公开的技术方案所期望的结果,本文在此不进行限制。It should be understood that various forms of the process shown above may be used, with steps reordered, added or deleted. For example, each step described in the present disclosure can be executed in parallel, sequentially, or in a different order. As long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, there is no limitation here.
上述具体实施方式,并不构成对本公开保护范围的限制。本领域技术人员应该明白的是,根据设计要求和其他因素,可以进行各种修改、组合、子组合和替代。任何在本公开的精神和原则之内所作的修改、等同替换和改进等,均应包含在本公开保护范围之内。 The above-mentioned specific embodiments do not constitute a limitation on the scope of the present disclosure. It will be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions are possible depending on design requirements and other factors. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and principles of this disclosure shall be included in the protection scope of this disclosure.

Claims (19)

  1. 一种多任务模型的训练方法,所述多任务模型包括多个任务处理子模型和共享子模型,所述方法包括:A training method for a multi-task model. The multi-task model includes multiple task processing sub-models and shared sub-models. The method includes:
    将样本对象的样本行为数据输入所述共享子模型,得到所述样本对象的行为特征信息,其中,所述行为特征信息包括多个行为特征子信息;Input the sample behavior data of the sample object into the shared sub-model to obtain the behavioral characteristic information of the sample object, where the behavioral characteristic information includes a plurality of behavioral characteristic sub-information;
    根据与所述任务处理子模型相关的所述行为特征子信息,得到所述任务处理子模型的子损失值;Obtain the sub-loss value of the task processing sub-model according to the behavioral characteristic sub-information related to the task processing sub-model;
    根据所述任务处理子模型的子损失值,确定与所述子损失值对应的目标梯度值;According to the sub-loss value of the task processing sub-model, determine the target gradient value corresponding to the sub-loss value;
    根据多个所述目标梯度值,确定与所述子损失值对应的权重值;以及Determine a weight value corresponding to the sub-loss value according to a plurality of the target gradient values; and
    根据多个所述子损失值以及分别与多个所述子损失值对应的多个所述权重值,训练所述多任务模型。The multi-task model is trained according to a plurality of sub-loss values and a plurality of weight values respectively corresponding to the plurality of sub-loss values.
  2. 根据权利要求1所述的方法,其中,所述根据多个所述目标梯度值,确定与所述子损失值对应的权重值包括:The method according to claim 1, wherein determining the weight value corresponding to the sub-loss value according to a plurality of the target gradient values includes:
    根据多个所述目标梯度值,确定处理参数值;以及determining processing parameter values based on a plurality of the target gradient values; and
    根据所述处理参数值和与所述子损失值对应的所述目标梯度值,确定与所述子损失值对应的所述权重值。The weight value corresponding to the sub-loss value is determined according to the processing parameter value and the target gradient value corresponding to the sub-loss value.
  3. 根据权利要求2所述的方法,其中,所述根据所述处理参数值和与所述子损失值对应的目标梯度值,确定与所述子损失值对应的所述权重值包括:The method of claim 2, wherein determining the weight value corresponding to the sub-loss value according to the processing parameter value and the target gradient value corresponding to the sub-loss value includes:
    对与所述子损失值对应的目标梯度值进行处理,得到处理后目标梯度值;Process the target gradient value corresponding to the sub-loss value to obtain the processed target gradient value;
    根据所述处理参数值,对所述处理后目标梯度值进行归一化处理,得到归一化梯度值;以及According to the processing parameter value, normalize the processed target gradient value to obtain a normalized gradient value; and
    根据所述归一化梯度值的倒数,确定与所述子损失值对应的所述权重值。The weight value corresponding to the sub-loss value is determined according to the reciprocal of the normalized gradient value.
  4. 根据权利要求1所述的方法,其中,所述根据与所述任务处理子模型相关的所述行为特征子信息,得到所述任务处理子模型的子损失值包括:The method according to claim 1, wherein obtaining the sub-loss value of the task processing sub-model based on the behavioral characteristic sub-information related to the task processing sub-model includes:
    将与所述任务处理子模型相关的所述行为特征子信息输入所述任务 处理子模型,得到所述任务处理子模型的输出结果;以及Input the behavioral characteristic sub-information related to the task processing sub-model into the task Process the sub-model to obtain the output result of the task processing sub-model; and
    根据所述输出结果,得到所述任务处理子模型的所述子损失值。According to the output result, the sub-loss value of the task processing sub-model is obtained.
  5. The method according to claim 1, wherein training the multi-task model according to a plurality of the sub-loss values and a plurality of the weight values respectively corresponding to the plurality of sub-loss values comprises:
    obtaining a loss value according to the plurality of sub-loss values and the plurality of weight values respectively corresponding to the plurality of sub-loss values; and
    training the multi-task model according to the loss value.
  6. The method according to claim 5, wherein obtaining the loss value according to the plurality of sub-loss values and the plurality of weight values respectively corresponding to the plurality of sub-loss values comprises:
    performing a weighted summation on the plurality of sub-loss values using the plurality of weight values respectively corresponding to the plurality of sub-loss values, to obtain the loss value.
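As a toy numerical check of the weighted summation in claim 6 (all numbers made up):

```python
sub_losses = [0.8, 0.2, 1.5]   # made-up sub-loss values
weights = [0.5, 2.0, 0.3]      # made-up corresponding weight values
loss = sum(w * l for w, l in zip(weights, sub_losses))
print(loss)  # 0.5*0.8 + 2.0*0.2 + 0.3*1.5 = 1.25 (up to float rounding)
```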
  7. The method according to claim 5, wherein training the multi-task model according to the loss value comprises:
    adjusting, according to the loss value, parameters of the plurality of task processing sub-models and of the shared sub-model respectively, so as to train the multi-task model.
  8. An information recommendation method, comprising:
    inputting target behavior data of a target object into a multi-task model to obtain a plurality of output results; and
    recommending target information to the target object according to the plurality of output results,
    wherein the multi-task model is trained by the method according to any one of claims 1 to 7.
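At inference time, the method of claim 8 could be served as in the following sketch. The fusion rule (summing per-task scores) and the top-k cut are assumptions of mine; the claim only requires that the recommendation follow from the multiple output results.

```python
import torch

@torch.no_grad()
def recommend(shared, task_models, target_behavior, top_k=10):
    # Shared sub-model turns the target object's behavior data into per-task features.
    feats = shared(target_behavior)
    # Multiple output results, e.g. one relevance score per candidate item per task.
    outputs = [m(f) for m, f in zip(task_models, feats)]
    # Assumed fusion rule: add per-task scores, then recommend the top-k items.
    scores = torch.stack(outputs, dim=0).sum(dim=0)
    return scores.topk(top_k, dim=-1).indices
```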
  9. A training apparatus for a multi-task model, the multi-task model comprising a plurality of task processing sub-models and a shared sub-model, the apparatus comprising:
    a first obtaining module configured to input sample behavior data of a sample object into the shared sub-model to obtain behavioral characteristic information of the sample object, wherein the behavioral characteristic information comprises a plurality of pieces of behavioral characteristic sub-information;
    a second obtaining module configured to obtain a sub-loss value of the task processing sub-model according to the behavioral characteristic sub-information related to the task processing sub-model;
    a first determination module configured to determine, according to the sub-loss value of the task processing sub-model, a target gradient value corresponding to the sub-loss value;
    a second determination module configured to determine, according to a plurality of the target gradient values, a weight value corresponding to the sub-loss value; and
    a training module configured to train the multi-task model according to a plurality of the sub-loss values and a plurality of the weight values respectively corresponding to the plurality of sub-loss values.
  10. The apparatus according to claim 9, wherein the second determination module comprises:
    a first determination sub-module configured to determine a processing parameter value according to a plurality of the target gradient values; and
    a second determination sub-module configured to determine the weight value corresponding to the sub-loss value according to the processing parameter value and the target gradient value corresponding to the sub-loss value.
  11. The apparatus according to claim 10, wherein the second determination sub-module comprises:
    a processing unit configured to process the target gradient value corresponding to the sub-loss value to obtain a processed target gradient value;
    a normalization unit configured to normalize the processed target gradient value according to the processing parameter value to obtain a normalized gradient value; and
    a determination unit configured to determine the weight value corresponding to the sub-loss value according to the reciprocal of the normalized gradient value.
  12. The apparatus according to claim 9, wherein the second obtaining module comprises:
    a first obtaining sub-module configured to input the behavioral characteristic sub-information related to the task processing sub-model into the task processing sub-model to obtain an output result of the task processing sub-model; and
    a second obtaining sub-module configured to obtain the sub-loss value of the task processing sub-model according to the output result.
  13. The apparatus according to claim 9, wherein the training module comprises:
    a third obtaining sub-module configured to obtain a loss value according to a plurality of the sub-loss values and a plurality of the weight values respectively corresponding to the plurality of sub-loss values; and
    a training sub-module configured to train the multi-task model according to the loss value.
  14. The apparatus according to claim 13, wherein the third obtaining sub-module comprises:
    a weighted summation unit configured to perform a weighted summation on the plurality of sub-loss values using the plurality of weight values respectively corresponding to the plurality of sub-loss values, to obtain the loss value.
  15. The apparatus according to claim 13, wherein the training sub-module comprises:
    an adjustment unit configured to adjust, according to the loss value, parameters of the plurality of task processing sub-models and of the shared sub-model respectively, so as to train the multi-task model.
  16. An information recommendation apparatus, comprising:
    a third obtaining module configured to input target behavior data of a target object into a multi-task model to obtain a plurality of output results; and
    a recommendation module configured to recommend target information to the target object according to the plurality of output results,
    wherein the multi-task model is trained by the apparatus according to any one of claims 9 to 15.
  17. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor, wherein
    the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method according to any one of claims 1 to 8.
  18. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are configured to cause a computer to perform the method according to any one of claims 1 to 8.
  19. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
PCT/CN2023/074122 2022-08-24 2023-02-01 Multi-task model training method, information recommendation method, apparatus, and device WO2024040869A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211015938.1A CN115081630A (en) 2022-08-24 2022-08-24 Training method of multi-task model, information recommendation method, device and equipment
CN202211015938.1 2022-08-24

Publications (1)

Publication Number Publication Date
WO2024040869A1 true WO2024040869A1 (en) 2024-02-29

Family

ID=83245010

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/074122 WO2024040869A1 (en) 2022-08-24 2023-02-01 Multi-task model training method, information recommendation method, apparatus, and device

Country Status (2)

Country Link
CN (1) CN115081630A (en)
WO (1) WO2024040869A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115081630A (en) * 2022-08-24 2022-09-20 北京百度网讯科技有限公司 Training method of multi-task model, information recommendation method, device and equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541124A (en) * 2020-12-24 2021-03-23 北京百度网讯科技有限公司 Method, apparatus, device, medium and program product for generating a multitask model
CN112561077A (en) * 2020-12-14 2021-03-26 北京百度网讯科技有限公司 Training method and device of multi-task model and electronic equipment
WO2022019913A1 (en) * 2020-07-23 2022-01-27 Google Llc Systems and methods for generation of machine-learned multitask models
CN114913371A (en) * 2022-05-10 2022-08-16 平安科技(深圳)有限公司 Multitask learning model training method and device, electronic equipment and storage medium
CN115081630A (en) * 2022-08-24 2022-09-20 北京百度网讯科技有限公司 Training method of multi-task model, information recommendation method, device and equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110209817B (en) * 2019-05-31 2023-06-09 安徽省泰岳祥升软件有限公司 Training method and device for text processing model and text processing method
CN111027428B (en) * 2019-11-29 2024-03-08 北京奇艺世纪科技有限公司 Training method and device for multitasking model and electronic equipment
CN112559007B (en) * 2020-12-14 2022-09-23 北京百度网讯科技有限公司 Parameter updating method and device of multitask model and electronic equipment

Also Published As

Publication number Publication date
CN115081630A (en) 2022-09-20

Similar Documents

Publication Publication Date Title
US10956748B2 (en) Video classification method, information processing method, and server
JP2022058915A (en) Method and device for training image recognition model, method and device for recognizing image, electronic device, storage medium, and computer program
CN110390408B (en) Transaction object prediction method and device
CN112559800B (en) Method, apparatus, electronic device, medium and product for processing video
US10445586B2 (en) Deep learning on image frames to generate a summary
US20220374776A1 (en) Method and system for federated learning, electronic device, and computer readable medium
CN114861889B (en) Deep learning model training method, target object detection method and device
US20230079275A1 (en) Method and apparatus for training semantic segmentation model, and method and apparatus for performing semantic segmentation on video
US20230215136A1 (en) Method for training multi-modal data matching degree calculation model, method for calculating multi-modal data matching degree, and related apparatuses
WO2024040869A1 (en) Multi-task model training method, information recommendation method, apparatus, and device
CN114020950A (en) Training method, device and equipment of image retrieval model and storage medium
KR20220010045A (en) Domain phrase mining method, equipment and electronic device
CN115147680B (en) Pre-training method, device and equipment for target detection model
CN115496970A (en) Training method of image task model, image recognition method and related device
CN114494747A (en) Model training method, image processing method, device, electronic device and medium
CN114037059A (en) Pre-training model, model generation method, data processing method and data processing device
CN113792876A (en) Backbone network generation method, device, equipment and storage medium
CN113657411A (en) Neural network model training method, image feature extraction method and related device
CN115880506A (en) Image generation method, model training method and device and electronic equipment
CN114926322A (en) Image generation method and device, electronic equipment and storage medium
CN114282049A (en) Video retrieval method, device, equipment and storage medium
US20240037410A1 (en) Method for model aggregation in federated learning, server, device, and storage medium
CN115131709B (en) Video category prediction method, training method and device for video category prediction model
US20240029416A1 (en) Method, device, and computer program product for image processing
CN114331379B (en) Method for outputting task to be handled, model training method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23855966

Country of ref document: EP

Kind code of ref document: A1