WO2024040869A1 - Multi-task model training method, information recommendation method, apparatus, and device - Google Patents

Multi-task model training method, information recommendation method, apparatus, and device Download PDF

Info

Publication number
WO2024040869A1
WO2024040869A1 PCT/CN2023/074122 CN2023074122W WO2024040869A1 WO 2024040869 A1 WO2024040869 A1 WO 2024040869A1 CN 2023074122 W CN2023074122 W CN 2023074122W WO 2024040869 A1 WO2024040869 A1 WO 2024040869A1
Authority
WO
WIPO (PCT)
Prior art keywords
sub
model
loss
value
task
Prior art date
Application number
PCT/CN2023/074122
Other languages
French (fr)
Chinese (zh)
Inventor
王震
张文慧
吴志华
于佃海
Original Assignee
北京百度网讯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司 filed Critical 北京百度网讯科技有限公司
Publication of WO2024040869A1 publication Critical patent/WO2024040869A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning

Definitions

  • the present disclosure relates to the technical field of artificial intelligence, especially to the technical fields of deep learning, cloud computing, multi-task parallel processing, data search and other technical fields. More specifically, the present disclosure provides a multi-task model training method, information recommendation method, device, electronic device and storage medium.
  • deep learning models can be used to handle multiple different tasks simultaneously.
  • the learning speed of each task can be adjusted.
  • the present disclosure provides a multi-task model training method, information recommendation method, device, equipment and storage medium.
  • a training method for a multi-task model includes a plurality of task processing sub-models and a shared sub-model.
  • the method includes: inputting sample behavior data of a sample object into the shared sub-model to obtain The behavioral characteristic information of the sample object, where the behavioral characteristic information includes multiple behavioral characteristic sub-information; according to the behavioral characteristic sub-information related to the task processing sub-model, the sub-loss value of the task processing sub-model is obtained; according to the sub-loss value of the task processing sub-model Loss value, determine the target gradient value corresponding to the sub-loss value; determine the weight value corresponding to the sub-loss value based on multiple target gradient values; and determine multiple sub-loss values based on the multiple sub-loss values and multiple weight values corresponding to the multiple sub-loss values. , train a multi-task model.
  • an information recommendation method includes: inputting the target behavior data of the target object into a multi-task model to obtain multiple output results; and recommending target information to the target object based on the multiple output results. , wherein the multi-task model is trained according to the method provided by this disclosure.
  • a training device for a multi-task model includes a plurality of task processing sub-models and a shared sub-model.
  • the device includes: a first acquisition module for obtaining samples of sample objects.
  • the behavioral data is input into the shared sub-model to obtain the behavioral characteristic information of the sample object, where the behavioral characteristic information includes multiple behavioral characteristic sub-information; the second acquisition module is used to obtain the task based on the behavioral characteristic sub-information related to the task processing sub-model.
  • the first determination module is used to process the sub-loss values of the sub-model according to the task and determine the target gradient value corresponding to the sub-loss value
  • the second determination module is used to determine the target gradient value according to multiple target gradient values. Determine weight values corresponding to the sub-loss values
  • a training module for training a multi-task model based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
  • an information recommendation device includes: a third acquisition module for inputting the target behavior data of the target object into a multi-task model to obtain multiple output results; a recommendation module for According to multiple output results, target information is recommended to the target object, wherein the multi-task model is trained according to the device provided by the present disclosure.
  • an electronic device including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions that can be executed by the at least one processor, and the instructions are At least one processor executes to enable at least one processor to execute the method provided in accordance with the present disclosure.
  • a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute a method provided according to the present disclosure.
  • a computer program product including a computer program that, when executed by a processor, implements a method provided according to the present disclosure.
  • Figure 1 is a flow chart of a training method for a multi-task model according to an embodiment of the present disclosure
  • Figure 2 is a schematic diagram of obtaining a sub-loss value of a task processing sub-model according to an embodiment of the present disclosure
  • Figure 3 is a schematic diagram of determining a target gradient value corresponding to a sub-loss value according to an embodiment of the present disclosure
  • Figure 4 is a flow chart of a training method for a multi-task model according to one embodiment of the present disclosure
  • Figure 5 is a flow chart of an information recommendation method according to another embodiment of the present disclosure.
  • Figure 6 is a block diagram of a training device for a multi-task model according to an embodiment of the present disclosure
  • Figure 7 is a block diagram of an information recommendation device according to another embodiment of the present disclosure.
  • FIG. 8 is a block diagram of an electronic device that can apply a multi-task model training method and/or an information recommendation method according to one embodiment of the present disclosure.
  • the collection, storage, use, processing, transmission, provision, disclosure and application of user personal information are in compliance with relevant laws and regulations, necessary confidentiality measures have been taken, and do not violate public order and good customs .
  • the user's authorization or consent is obtained before obtaining or collecting the user's personal information.
  • Deep learning frameworks can provide developers with relevant model interfaces.
  • the deep learning framework can also provide auxiliary tools for analyzing scenario problems from the developer's perspective to improve development efficiency and development experience.
  • Deep learning frameworks may include, for example, PaddlePaddle, Tensorflow, and so on.
  • the Flying Paddle framework can support deep learning models and machine learning models.
  • the learning forms of the model can include supervised learning and unsupervised learning, etc.
  • Supervised learning can include, for example, multi-task learning.
  • the task of optimizing more than one objective function can be called multi-task learning.
  • the counterpart to multi-task learning is single-task learning.
  • single-task learning the optimization of an objective function can be studied. In order to achieve multi-task learning, it is necessary to process data from multiple tasks and balance the learning speed between multiple tasks.
  • Multi-task learning has a wide range of applications and can be applied to natural language processing, computer vision and information recommendation and other technical fields.
  • the behavioral data of an object may involve multiple tasks. For example, for short videos, determining the video completion rate, determining the click rate, determining the attention rate, determining the like rate, etc. can be regarded as one task respectively.
  • multi-task learning can be implemented by a multi-task model.
  • a certain balance can be maintained between different tasks so that the parameters learned by the multi-task model have certain applicability to each task.
  • the weight of the tasks can be set.
  • the gradient changes of different sub-models used to perform multiple tasks are difficult to represent with a constant. Even if the gradient changes of some sub-models can be represented by constants, multiple sets of experiments need to be set up for one sub-model for debugging. Therefore, the time cost of training multi-task models is relatively high.
  • Figure 1 is a flow chart of a training method for a multi-task model according to an embodiment of the present disclosure.
  • the method 100 may include operations S110 to S150.
  • the multi-tasking model includes multiple task processing sub-models and sharing sub-models.
  • each task processing submodel is used to process one task.
  • the shared submodel can be used to extract features of sample data.
  • the sample behavior data of the sample object is input into the shared sub-model to obtain the behavior characteristic information of the sample object.
  • the sample behavior data may be the behavior data of the sample object within a sample period.
  • the historical behavior performed by the sample object on one or more historical information can be collected.
  • the historical information may be, for example, short videos that the sample subject has watched.
  • These historical behaviors may include, for example, clicks, followings, likes, comments, and other behaviors.
  • the behavioral characteristic information includes a plurality of behavioral characteristic sub-information.
  • a behavioral characteristic sub-information can correspond to a behavior.
  • a sub-loss value of the task processing sub-model is obtained based on the behavioral characteristic sub-information related to the task processing sub-model.
  • the task processing sub-model can be used to process the behavioral characteristic sub-information to obtain the output result. Based on the output results, determine the sub-loss value of the task processing sub-model.
  • the output result can indicate the probability that the sample object performs an action on the sample information.
  • the sample information may be a short video that the sample subject has not watched.
  • various loss functions can be used to determine the sub-loss value of the task processing sub-model based on the output results.
  • a target gradient value corresponding to the sub-loss value is determined according to the sub-loss value of the task processing sub-model.
  • the target gradient value may be a gradient value of the shared sub-model.
  • the task processing sub-model may include at least one task processing layer.
  • the shared submodel may include at least one shared layer
  • At least one gradient value of the task processing sub-model can be determined, or at least one gradient value of the shared sub-model can be determined.
  • the target gradient value corresponding to the sub-loss value may be determined based on at least one gradient value of the shared sub-model.
  • any one of the at least one gradient value of the shared sub-model can be used as the target gradient value.
  • a weight value corresponding to the sub-loss value is determined according to the plurality of target gradient values.
  • the weight value corresponding to the sub-loss value may be determined.
  • various operations can be performed based on multiple target gradient values to obtain the operation results of the gradient values.
  • Various operations may include, for example, summation, multiplication, and the like.
  • the weight value corresponding to the sub-loss value can be obtained.
  • a multi-task model is trained according to a plurality of sub-loss values and a plurality of weight values respectively corresponding to the plurality of sub-loss values.
  • a loss value can be obtained based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values. Based on the loss value, a multi-task model can be trained.
  • various operations can be performed based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values to obtain the loss value.
  • various operations may include weighted sums, weighted averages, and the like.
  • a multi-task model can be trained based on the loss value.
  • the final loss value is determined based on multiple target gradient values. Pay attention to It eliminates the difference in learning speed between different tasks and can effectively balance the learning speed of different tasks, making the parameters learned by the multi-task model have certain applicability to different tasks.
  • the multi-task model obtained thus has stronger multi-task parallel processing capabilities, improves data processing efficiency when hardware resources are preset or limited, and can recommend more accurate information based on the object's behavioral data.
  • Figure 2 is a schematic diagram of obtaining sub-loss values of a task processing sub-model according to one embodiment of the present disclosure.
  • the multi-task model may include a shared sub-model 210 and n task processing sub-models.
  • n is an integer greater than 1.
  • the n task processing sub-models may include, for example, the first task processing sub-model 221, the second task processing sub-model 222, ..., the n-th task processing sub-model 223.
  • n can be 3.
  • the sample behavior data of the sample object can be input into the shared sub-model to obtain the behavioral characteristic information of the sample object.
  • Behavioral characteristic information may include n behavioral characteristic sub-information.
  • shared submodel 210 may include multiple shared layers.
  • the multiple shared layers can be the first shared layer, the second shared layer,..., the m-th shared layer, where m is an integer greater than 1.
  • the first shared layer can process sample behavioral data and obtain the first initial behavioral characteristic information.
  • the second shared layer can process the first initial behavioral characteristic information and obtain the second initial behavioral characteristic information.
  • the m-th shared layer can process the m-1 initial behavioral feature information and obtain the m-th initial behavioral feature information.
  • the m-th initial behavioral characteristic information can be used as the behavioral characteristic information of the sample object.
  • the n behavioral characteristic sub-information may include the first behavioral characteristic sub-information, the second behavioral characteristic sub-information, ..., the n-th behavioral characteristic sub-information.
  • the sub-loss value of the task processing sub-model can be obtained.
  • the behavioral characteristic sub-information related to the task processing sub-model can be input into the task processing sub-model to obtain the output result of the task processing sub-model. According to the output results, The sub-loss value of the task processing sub-model can be obtained.
  • the first behavioral characteristic sub-information can be input into the first task processing sub-model 221 to obtain the first output result 2211.
  • the first output result 2211 may indicate the predicted probability of the sample object performing the first behavior.
  • various loss functions can be used to determine the first sub-loss value 2212.
  • the first sub-label can indicate the true probability of the sample object performing the first behavior.
  • Various loss functions may include, for example, Cross Entropy (CE) loss function, L1 loss function, etc.
  • the second behavioral characteristic sub-information can be input into the second task processing sub-model 222 to obtain the second output result 2221.
  • the second output result 2221 may indicate the predicted probability of the sample object performing the second behavior.
  • various loss functions can be used to determine the second sub-loss value 2222.
  • the second sub-label can indicate the true probability of the sample object performing the second behavior.
  • the n-th behavioral characteristic sub-information can be input into the n-th task processing sub-model 223 to obtain the n-th output result 2231.
  • the nth output result 2231 may indicate the predicted probability of the sample object performing the nth behavior.
  • various loss functions are used to determine the n-th sub-loss value 2232 based on the n-th output result 2231 and the n-th sub-label of the sample behavior data.
  • the nth sub-label can indicate the true probability of the sample object performing the nth behavior.
  • manual annotation can be performed on a piece of sample information and sample behavior data 201 to obtain a label for the sample information.
  • the tag can include n sub-tags.
  • the n sub-tags may include, for example, the above-mentioned first sub-tag, second sub-tag, ..., n-th sub-tag.
  • FIG. 3 is a schematic diagram of determining a target gradient value corresponding to a sub-loss value according to an embodiment of the present disclosure.
  • the multi-task model may include a shared sub-model 310 and n task processing sub-models.
  • the n task processing sub-models may include, for example, the first task processing sub-model 321, the second task processing sub-model 322, ..., and the n-th task processing sub-model 323.
  • n can be 3.
  • Shared sub-model 310 may also include multiple shared layers.
  • the multiple shared layers can be the first shared layer, the second shared layer,..., the m-th shared layer, where m is an integer greater than 1.
  • the first shared layer can process sample behavioral data and obtain the first initial behavioral characteristic information.
  • the second shared layer can process the first initial behavioral characteristic information and obtain the second initial behavioral characteristic information.
  • the m-th shared layer can process the m-1 initial behavioral feature information and obtain the m-th initial behavioral feature information.
  • the m-th initial behavioral characteristic information can be used as the behavioral characteristic information of the sample object.
  • the target gradient value corresponding to the loss value may be determined. It can be understood that the above detailed description of the first sub-loss value 2212, the second sub-loss value 2222, ..., the n-th sub-loss value 2232 can also be applied to this embodiment, and the present disclosure will not be repeated here.
  • the gradient value of the first task processing sub-model 321 can be determined.
  • the m first gradient values of the shared sub-model 310 can be determined.
  • the first gradient value of the first shared layer can be used as the first target gradient value 3213. It can be understood that in the process of determining the first gradient value of the first shared layer, the relevant parameters of the first shared layer can be used.
  • the gradient value of the second task processing sub-model 322 can be determined based on the second sub-loss value 3222 and the parameters of the second task processing sub-model 322.
  • m second gradient values of the shared sub-model 310 can be determined.
  • the second gradient value of the first shared layer can be used as the second target gradient value 3223. It can be understood that in the process of determining the second gradient value of the first shared layer, the relevant parameters of the first shared layer can be used.
  • the gradient value of the n-th task processing sub-model 323 can be determined according to the n-th sub-loss value 3232 and the parameters of the n-th task processing sub-model 323.
  • the m nth gradient values of the shared sub-model 310 can be determined.
  • the nth gradient value of the first shared layer can be used as the nth target gradient value 3233. It can be understood that when determining the nth gradient value of the first shared layer In the process, you can use the relevant parameters of the first shared layer.
  • the first shared layer can process sample behavioral data.
  • the first shared layer can be the last shared layer of multiple shared layers.
  • any first gradient value among the m first gradient values may also be used as the first target gradient value. It is also possible to use any second gradient value among the m second gradient values as the second target gradient value. ...You can also use any n-th gradient value among the m n-th gradient values as the n-th target gradient value.
  • Figure 4 is a flowchart of a training method for a multi-task model according to one embodiment of the present disclosure.
  • the sample behavior data of the sample object can be input into the shared sub-model to obtain the behavioral characteristic information of the sample object.
  • behavioral characteristic information includes multiple behavioral characteristic sub-information.
  • the sub-loss value of the task processing sub-model is obtained according to the behavioral characteristic sub-information related to the task processing sub-model.
  • the first sub-loss value loss_1 can be obtained.
  • the first sub-loss value loss_1 can be the sub-loss value of the first task processing sub-model.
  • the second sub-loss value loss_2 can be obtained.
  • the second sub-loss value loss_2 can be the sub-loss value of the second task processing sub-model.
  • the nth sub-loss value loss_n can be obtained.
  • the n-th sub-loss value loss_n may be the sub-loss value of the n-th task processing sub-model.
  • operations S430, S440, and S451 may be performed.
  • a target gradient value corresponding to the sub-loss value is determined.
  • the first target gradient value grad_1 corresponding to the first sub-loss value loss_1 can be determined based on the first sub-loss value loss_1.
  • the second target gradient value grad_2 corresponding to the second sub-loss value loss_2 can be determined based on the second sub-loss value loss_2.
  • the n-th target gradient value grad_n corresponding to the n-th sub-loss value loss_n can be determined based on the n-th sub-loss value loss_n.
  • the relevant parameters para_last_shared_layer of the first shared layer of the shared sub-model can be used in the process of determining the first target gradient value grad_1, the second target gradient value grad_2, ..., and the nth target gradient value grad_n.
  • the weight value corresponding to the sub-loss value can be determined according to multiple target gradient values.
  • the processing parameter values can be determined based on multiple target gradient values.
  • the processing parameter value can be determined based on the first target gradient value grad_1, the second target gradient value grad_2, ..., and the nth target gradient value grad_n.
  • the weight value corresponding to the sub-loss value may be determined.
  • the processed target gradient value can be obtained.
  • the processed target gradient value can be normalized to obtain the normalized gradient value.
  • the weight value corresponding to the sub-loss value can be determined.
  • the weight value of the task can be smaller through the embodiments of the present disclosure. In the final loss value, the proportion of the task decreases, which can effectively balance the learning speed between different tasks.
  • the i-th weight value w_i corresponding to the i-th sub-loss value loss_i can be determined by the following formula.
  • i can be an integer greater than or equal to 1, and i can be an integer less than or equal to n.
  • ⁇ i exp(grad_i) can be a processing parameter value.
  • exp(grad_i) can be the i-th processed target gradient value.
  • the first weight value w_1, the second weight value w_2,..., and the nth weight value W_n can be determined.
  • the first target gradient value grad_1 is processed to obtain the first A processed target gradient value exp(grad_1).
  • the processing parameter value ⁇ i exp(grad_i) normalize the first processed target gradient value exp(grad_1) to get the first normalized gradient value exp(grad_1)/ ⁇ i exp(grad_i) .
  • the reciprocal of the first normalized gradient value can be used as the first weight value w_1.
  • the loss value can be obtained based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
  • the loss value can be obtained by performing a weighted sum based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
  • the first sub-loss value loss_1, the second sub-loss value loss_2,..., the n-th sub-loss value loss_n and the first weight value w_1, the second weight value w_2,..., the n-th weight value can be used w_n, perform weighted summation to obtain the loss value Loss.
  • a multi-task model can be trained according to the loss value.
  • the parameters of multiple task processing sub-models and shared sub-models can be adjusted separately according to the loss value to train a multi-task model.
  • the loss value can be used to adjust parameters of multiple task processing sub-models and shared sub-models.
  • Figure 5 is a flow chart of an information recommendation method according to another embodiment of the present disclosure.
  • the method 500 may include operations S510 to S520.
  • the target behavior data of the target object is input into the multi-task model to obtain multiple output results.
  • the multi-task model may be trained according to the method provided by the present disclosure.
  • a multi-task model can be trained according to method 100.
  • the output result may indicate the probability that the target object performs an action on a piece of candidate information.
  • the candidate information can be short video, image or text information.
  • an output result may indicate the probability that the target object performs a "click" action on the candidate information.
  • another output result may indicate the probability that the target object performs a "comment” action on the candidate information.
  • target information is recommended to the target object according to the plurality of output results.
  • recommended parameters of candidate information are determined based on multiple output results.
  • the target information is determined from the plurality of candidate information according to the recommended parameters of the plurality of candidate information. Recommend targeted information to target audiences.
  • the output can be normalized to a value greater than 0 and less than 1.
  • various operations can be performed to obtain recommended parameters for a candidate information.
  • various operations may include, for example, averaging, summing, weighted sums, and the like.
  • the candidate information with the largest recommendation parameter can be used as the target information and recommended to the target object.
  • information can be accurately recommended to target objects, information push efficiency can be improved, and user experience can be improved.
  • candidate information may be determined from a plurality of initial information based on target behavior data of the target object.
  • target behavior data can be converted into a target vector. Calculate the similarity between the target vector and the feature vector of the initial information. When the similarity is greater than the preset similarity threshold, the initial information corresponding to the similarity can be used as a candidate information.
  • multiple candidate information may be determined.
  • the application method of the multi-task model provided by the present disclosure in the field of information recommendation is described in detail above, but the present disclosure is not limited thereto.
  • the multi-task model provided by this disclosure can also be applied to other fields (such as image processing, text processing, audio processing, etc.).
  • Figure 6 is a block diagram of a training device for a multi-task model according to an embodiment of the present disclosure.
  • the multi-tasking model may include multiple task processing sub-models and sharing sub-models.
  • the device 600 may include a first obtaining module 610 , a second obtaining module 620 , a first determining module 630 , a second determining module 640 and a training module 650 .
  • the first obtaining module 610 is used to input the sample behavior data of the sample object into the shared sub-model, Obtain the behavioral characteristic information of the sample object.
  • behavioral characteristic information includes multiple behavioral characteristic sub-information.
  • the second obtaining module 620 is used to obtain the sub-loss value of the task processing sub-model according to the behavioral characteristic sub-information related to the task processing sub-model.
  • the first determination module 630 is configured to determine the target gradient value corresponding to the sub-loss value according to the sub-loss value of the task processing sub-model.
  • the second determination module 640 is used to determine the weight value corresponding to the sub-loss value according to multiple target gradient values.
  • the training module 650 is used to train a multi-task model based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
  • the second determination module includes: a first determination sub-module, used to determine the processing parameter value according to the plurality of target gradient values; and a second determination sub-module, used to determine the processing parameter value and the sub-loss value according to the processing parameter value and the sub-loss value.
  • the corresponding target gradient value determines the weight value corresponding to the sub-loss value.
  • the second determination sub-module includes: a processing unit, used to process the target gradient value corresponding to the sub-loss value, to obtain the processed target gradient value; a normalization processing unit, used to process the target gradient value according to the processing parameter value , normalize the processed target gradient value to obtain a normalized gradient value; and a determination unit used to determine the weight value corresponding to the sub-loss value based on the reciprocal of the normalized gradient value.
  • the second obtaining module includes: a first obtaining sub-module, used to input the behavioral characteristic sub-information related to the task processing sub-model into the task processing sub-model, and obtain the output result of the task processing sub-model; and a second obtaining sub-module. Obtain sub-module, which is used to obtain the sub-loss value of the task processing sub-model based on the output results.
  • the training module includes: a third obtaining sub-module, used to obtain a loss value according to multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values; and a training sub-module, used to obtain a loss value according to the loss value to train a multi-task model.
  • the third obtaining sub-module includes: a weighted summation unit, configured to perform weighted summation based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values, to obtain a loss value.
  • the training sub-module includes: an adjustment unit, configured to respectively adjust parameters of multiple task processing sub-models and shared sub-models according to the loss value to train the multi-task model.
  • FIG. 7 is a block diagram of an information recommendation device according to another embodiment of the present disclosure.
  • the device 700 may include a third obtaining module 710 and a recommendation module 720 .
  • the third obtaining module 710 is used to input the target behavior data of the target object into the multi-task model to obtain multiple output results.
  • the recommendation module 720 is used to recommend target information to target objects based on multiple output results.
  • the multi-task model is trained according to the device provided by the present disclosure.
  • the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
  • an electronic device includes: at least one processor; and a memory communicatively connected with the at least one processor.
  • the memory stores instructions executable by at least one processor, and the instructions are executed by at least one processor, so that the at least one processor can perform the method provided according to the present disclosure.
  • the readable storage medium stores computer instructions
  • the readable storage medium may be a non-transitory computer-readable storage medium.
  • computer instructions may cause a computer to perform methods provided in accordance with the present disclosure.
  • a computer program product includes a computer program that, when executed by a processor, implements a method provided in accordance with the present disclosure. Detailed description will be given below with reference to Figure 8 .
  • FIG. 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure.
  • Electronic devices are intended to refer to various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • the device 800 includes a computing unit 801 that can execute according to a computer program stored in a read-only memory (ROM) 802 or loaded from a storage unit 808 into a random access memory (RAM) 803 Various appropriate actions and treatments. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored.
  • Computing unit 801, ROM 802 and RAM 803 are connected to each other via bus 804.
  • Input/output (I/O) Interface 805 is also connected to bus 804.
  • the I/O interface 805 includes: an input unit 806, such as a keyboard, a mouse, etc.; an output unit 807, such as various types of displays, speakers, etc.; a storage unit 808, such as a magnetic disk, optical disk, etc. ; and communication unit 809, such as a network card, modem, wireless communication transceiver, etc.
  • the communication unit 809 allows the device 800 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.
  • Computing unit 801 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, digital signal processing processor (DSP), and any appropriate processor, controller, microcontroller, etc.
  • the computing unit 801 performs various methods and processes described above, such as a multi-task model training method and/or an information recommendation method.
  • the multi-task model training method and/or the information recommendation method may be implemented as a computer software program, which is tangibly included in a machine-readable medium, such as the storage unit 808.
  • part or all of the computer program may be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809.
  • the computer program When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the training method of the multi-task model and/or the information recommendation method described above may be performed.
  • the computing unit 801 may be configured to perform the multi-task model training method and/or the information recommendation method in any other suitable manner (eg, by means of firmware).
  • Various implementations of the systems and techniques described above may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip implemented in a system (SOC), complex programmable logic device (CPLD), computer hardware, firmware, software, and/or combinations thereof.
  • FPGAs field programmable gate arrays
  • ASICs application specific integrated circuits
  • ASSPs application specific standard products
  • SOC system
  • CPLD complex programmable logic device
  • computer hardware firmware, software, and/or combinations thereof.
  • These various embodiments may include implementation in one or more computer programs executable and/or interpreted on a programmable system including at least one programmable processor, the programmable processor
  • the processor which may be a special purpose or general purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • An output device may be a special purpose or general purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • An output device may be a special purpose or general purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing device, such that the program codes, when executed by the processor or controller, cause the functions specified in the flowcharts and/or block diagrams/ The operation is implemented.
  • the program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, laptop disks, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
  • RAM random access memory
  • ROM read only memory
  • EPROM or flash memory erasable programmable read only memory
  • CD-ROM portable compact disk read-only memory
  • magnetic storage device or any suitable combination of the above.
  • the systems and techniques described herein may be implemented on a computer having a display device (eg, a CRT (Cathode Ray Tube) display or an LCD (Liquid Crystal Display)) for displaying information to the user. ; and a keyboard and pointing device (eg, a mouse or a trackball) through which a user can provide input to the computer.
  • a display device eg, a CRT (Cathode Ray Tube) display or an LCD (Liquid Crystal Display)
  • a keyboard and pointing device eg, a mouse or a trackball
  • Other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and may be provided in any form, including Acoustic input, voice input or tactile input) to receive input from the user.
  • the systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., A user's computer having a graphical user interface or web browser through which the user can interact with implementations of the systems and technologies described herein), or including such backend components, middleware components, or any combination of front-end components in a computing system.
  • Digital data communications e.g., communications networks
  • Examples of communication networks include: local area network (LAN), wide area network (WAN), and the Internet.
  • Computer systems may include clients and servers.
  • Clients and servers are generally remote from each other and typically interact over a communications network.
  • the relationship of client and server is created by computer programs running on corresponding computers and having a client-server relationship with each other.
  • the server can be a cloud server, a distributed system server, or a server combined with a blockchain.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a multi-task model training method, relating to the technical field of artificial intelligence, and in particular, to the technical fields of deep learning, cloud computing, multi-task parallel processing, and data searching. A specific implementation solution comprises: inputting sample behavior data of a sample object into a shared sub-model, to obtain behavior characteristic information of the sample object, the behavior characteristic information comprising multiple pieces of behavior characteristic sub-information; obtaining a sub-loss value of a task processing sub-model according to the behavior characteristic sub-information associated with the task processing sub-model; determining a target gradient value corresponding to the sub-loss value, according to the sub-loss value of the task processing sub-model; determining a weight value corresponding to the sub-loss value, according to multiple target gradient values; and training a multi-task model according to multiple sub-loss values and the multiple weight values corresponding to the multiple sub-loss values. The present disclosure further provides an information recommendation method, an apparatus, an electronic device, and a storage medium.

Description

多任务模型的训练方法、信息推荐方法、装置和设备Training methods, information recommendation methods, devices and equipment for multi-task models
本申请要求于2022年8月24日递交的中国专利申请No.202211015938.1的优先权,其内容一并在此作为参考。This application claims priority from Chinese Patent Application No. 202211015938.1 submitted on August 24, 2022, the content of which is hereby incorporated by reference.
技术领域Technical field
本公开涉及人工智能技术领域,尤其涉及深度学习、云计算、多任务并行处理和数据搜索等技术领域。更具体地,本公开提供了一种多任务模型的训练方法、信息推荐方法、装置、电子设备和存储介质。The present disclosure relates to the technical field of artificial intelligence, especially to the technical fields of deep learning, cloud computing, multi-task parallel processing, data search and other technical fields. More specifically, the present disclosure provides a multi-task model training method, information recommendation method, device, electronic device and storage medium.
背景技术Background technique
随着人工智能技术的发展,深度学习模型可以用于同时处理多个不同的任务。在训练过程中,为了使得深度学习模型学习到的参数对每个任务都有一定的适用性,可以调整各任务的学习速度。With the development of artificial intelligence technology, deep learning models can be used to handle multiple different tasks simultaneously. During the training process, in order to make the parameters learned by the deep learning model have certain applicability to each task, the learning speed of each task can be adjusted.
发明内容Contents of the invention
本公开提供了一种多任务模型的训练方法、信息推荐方法、装置、设备以及存储介质。The present disclosure provides a multi-task model training method, information recommendation method, device, equipment and storage medium.
根据本公开的一方面,提供了一种多任务模型的训练方法,多任务模型包括多个任务处理子模型和共享子模型,该方法包括:将样本对象的样本行为数据输入共享子模型,得到样本对象的行为特征信息,其中,行为特征信息包括多个行为特征子信息;根据与任务处理子模型相关的行为特征子信息,得到任务处理子模型的子损失值;根据任务处理子模型的子损失值,确定与子损失值对应的目标梯度值;根据多个目标梯度值,确定与子损失值对应的权重值;以及根据多个子损失值以及分别与多个子损失值对应的多个权重值,训练多任务模型。According to one aspect of the present disclosure, a training method for a multi-task model is provided. The multi-task model includes a plurality of task processing sub-models and a shared sub-model. The method includes: inputting sample behavior data of a sample object into the shared sub-model to obtain The behavioral characteristic information of the sample object, where the behavioral characteristic information includes multiple behavioral characteristic sub-information; according to the behavioral characteristic sub-information related to the task processing sub-model, the sub-loss value of the task processing sub-model is obtained; according to the sub-loss value of the task processing sub-model Loss value, determine the target gradient value corresponding to the sub-loss value; determine the weight value corresponding to the sub-loss value based on multiple target gradient values; and determine multiple sub-loss values based on the multiple sub-loss values and multiple weight values corresponding to the multiple sub-loss values. , train a multi-task model.
根据本公开的另一方面,提供了一种信息推荐方法,该方法包括:将目标对象的目标行为数据输入多任务模型,得到多个输出结果;根据多个输出结果,向目标对象推荐目标信息,其中,多任务模型是根据本公开提供的方法训练的。 According to another aspect of the present disclosure, an information recommendation method is provided. The method includes: inputting the target behavior data of the target object into a multi-task model to obtain multiple output results; and recommending target information to the target object based on the multiple output results. , wherein the multi-task model is trained according to the method provided by this disclosure.
根据本公开的另一方面,提供了一种多任务模型的训练装置,多任务模型包括多个任务处理子模型和共享子模型,该装置包括:第一获得模块,用于将样本对象的样本行为数据输入共享子模型,得到样本对象的行为特征信息,其中,行为特征信息包括多个行为特征子信息;第二获得模块,用于根据与任务处理子模型相关的行为特征子信息,得到任务处理子模型的子损失值;第一确定模块,用于根据任务处理子模型的子损失值,确定与子损失值对应的目标梯度值;第二确定模块,用于根据多个目标梯度值,确定与子损失值对应的权重值;以及训练模块,用于根据多个子损失值以及分别与多个子损失值对应的多个权重值,训练多任务模型。According to another aspect of the present disclosure, a training device for a multi-task model is provided. The multi-task model includes a plurality of task processing sub-models and a shared sub-model. The device includes: a first acquisition module for obtaining samples of sample objects. The behavioral data is input into the shared sub-model to obtain the behavioral characteristic information of the sample object, where the behavioral characteristic information includes multiple behavioral characteristic sub-information; the second acquisition module is used to obtain the task based on the behavioral characteristic sub-information related to the task processing sub-model. Process the sub-loss values of the sub-model; the first determination module is used to process the sub-loss values of the sub-model according to the task and determine the target gradient value corresponding to the sub-loss value; the second determination module is used to determine the target gradient value according to multiple target gradient values. Determine weight values corresponding to the sub-loss values; and a training module for training a multi-task model based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
根据本公开的另一方面,提供了一种信息推荐装置,该装置包括:第三获得模块,用于将目标对象的目标行为数据输入多任务模型,得到多个输出结果;推荐模块,用于根据多个输出结果,向目标对象推荐目标信息,其中,多任务模型是根据本公开提供的装置训练的。According to another aspect of the present disclosure, an information recommendation device is provided. The device includes: a third acquisition module for inputting the target behavior data of the target object into a multi-task model to obtain multiple output results; a recommendation module for According to multiple output results, target information is recommended to the target object, wherein the multi-task model is trained according to the device provided by the present disclosure.
根据本公开的另一方面,提供了一种电子设备,包括:至少一个处理器;以及与至少一个处理器通信连接的存储器;其中,存储器存储有可被至少一个处理器执行的指令,指令被至少一个处理器执行,以使至少一个处理器能够执行根据本公开提供的方法。According to another aspect of the present disclosure, an electronic device is provided, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions that can be executed by the at least one processor, and the instructions are At least one processor executes to enable at least one processor to execute the method provided in accordance with the present disclosure.
根据本公开的另一方面,提供了一种存储有计算机指令的非瞬时计算机可读存储介质,该计算机指令用于使计算机执行根据本公开提供的方法。According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute a method provided according to the present disclosure.
根据本公开的另一方面,提供了一种计算机程序产品,包括计算机程序,该计算机程序在被处理器执行时实现根据本公开提供的方法。According to another aspect of the present disclosure, a computer program product is provided, including a computer program that, when executed by a processor, implements a method provided according to the present disclosure.
应当理解,本部分所描述的内容并非旨在标识本公开的实施例的关键或重要特征,也不用于限制本公开的范围。本公开的其它特征将通过以下的说明书而变得容易理解。It should be understood that what is described in this section is not intended to identify key or important features of the embodiments of the disclosure, nor is it intended to limit the scope of the disclosure. Other features of the present disclosure will become readily understood from the following description.
附图说明Description of drawings
附图用于更好地理解本方案,不构成对本公开的限定。其中:The accompanying drawings are used to better understand the present solution and do not constitute a limitation of the present disclosure. in:
图1是根据本公开的一个实施例的多任务模型的训练方法的流程图;Figure 1 is a flow chart of a training method for a multi-task model according to an embodiment of the present disclosure;
图2是根据本公开的一个实施例的获得任务处理子模型的子损失值的示意图; Figure 2 is a schematic diagram of obtaining a sub-loss value of a task processing sub-model according to an embodiment of the present disclosure;
图3是根据本公开的一个实施例的确定与子损失值对应的目标梯度值的示意图;Figure 3 is a schematic diagram of determining a target gradient value corresponding to a sub-loss value according to an embodiment of the present disclosure;
图4是根据本公开的一个实施例的多任务模型的训练方法的流程图;Figure 4 is a flow chart of a training method for a multi-task model according to one embodiment of the present disclosure;
图5是根据本公开的另一个实施例的信息推荐方法的流程图;Figure 5 is a flow chart of an information recommendation method according to another embodiment of the present disclosure;
图6是根据本公开的一个实施例的多任务模型的训练装置的框图;Figure 6 is a block diagram of a training device for a multi-task model according to an embodiment of the present disclosure;
图7是根据本公开的另一个实施例的信息推荐装置的框图;以及Figure 7 is a block diagram of an information recommendation device according to another embodiment of the present disclosure; and
图8是根据本公开的一个实施例的可以应用多任务模型的训练方法和/或信息推荐方法的电子设备的框图。FIG. 8 is a block diagram of an electronic device that can apply a multi-task model training method and/or an information recommendation method according to one embodiment of the present disclosure.
具体实施方式Detailed ways
以下结合附图对本公开的示范性实施例做出说明,其中包括本公开实施例的各种细节以助于理解,应当将它们认为仅仅是示范性的。因此,本领域普通技术人员应当认识到,可以对这里描述的实施例做出各种改变和修改,而不会背离本公开的范围和精神。同样,为了清楚和简明,以下的描述中省略了对公知功能和结构的描述。Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding and should be considered to be exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the disclosure. Also, descriptions of well-known functions and constructions are omitted from the following description for clarity and conciseness.
本公开的技术方案中,所涉及的用户个人信息的收集、存储、使用、加工、传输、提供、公开和应用等处理,均符合相关法律法规的规定,采取了必要保密措施,且不违背公序良俗。在本公开的技术方案中,在获取或采集用户个人信息之前,均获取了用户的授权或同意。In the technical solution of this disclosure, the collection, storage, use, processing, transmission, provision, disclosure and application of user personal information are in compliance with relevant laws and regulations, necessary confidentiality measures have been taken, and do not violate public order and good customs . In the technical solution of the present disclosure, the user's authorization or consent is obtained before obtaining or collecting the user's personal information.
深度学习框架可以为开发者提供相关的模型接口。深度学习框架也可以从开发者的角度出发,提供用于分析场景问题的辅助工具,以便提高开发效率及开发体验。深度学习框架例如可以包括飞桨(PaddlePaddle)、张量流(Tensorflow)等等。在这些框架中,例如飞桨框架可以支持深度学习模型和机器学习模型。Deep learning frameworks can provide developers with relevant model interfaces. The deep learning framework can also provide auxiliary tools for analyzing scenario problems from the developer's perspective to improve development efficiency and development experience. Deep learning frameworks may include, for example, PaddlePaddle, Tensorflow, and so on. Among these frameworks, for example, the Flying Paddle framework can support deep learning models and machine learning models.
对于深度学习模型,模型的学习形式可以包括有监督学习和无监督学习等等。有监督学习例如可以包括多任务学习。可以将优化多于一个目标函数的任务称为多任务学习。与多任务学习对应的是单任务学习。对于单任务学习,可以研究一个目标函数的优化。为了实现多任务学习,需要处理多个任务的数据,以及平衡多个任务之间学习速度。For deep learning models, the learning forms of the model can include supervised learning and unsupervised learning, etc. Supervised learning can include, for example, multi-task learning. The task of optimizing more than one objective function can be called multi-task learning. The counterpart to multi-task learning is single-task learning. For single-task learning, the optimization of an objective function can be studied. In order to achieve multi-task learning, it is necessary to process data from multiple tasks and balance the learning speed between multiple tasks.
多任务学习的应用范围较广,可以应用于自然语言处理、计算机视觉 和信息推荐等技术领域。在一些实施例中,以多任务学习在信息推荐领域的应用为示例,对象的行为数据可以涉及多个任务。例如,对于短视频,确定视频完播率、确定点击率、确定关注率、确定点赞率等可以分别作为一个任务。Multi-task learning has a wide range of applications and can be applied to natural language processing, computer vision and information recommendation and other technical fields. In some embodiments, taking the application of multi-task learning in the field of information recommendation as an example, the behavioral data of an object may involve multiple tasks. For example, for short videos, determining the video completion rate, determining the click rate, determining the attention rate, determining the like rate, etc. can be regarded as one task respectively.
在一些实施例中,多任务学习可以由多任务模型实现。在训练多任务模型的过程中,不同的任务之间可以保持一定的平衡,以便多任务模型学习到的参数对每个任务都有一定的适用性。例如,为了平衡不同任务的学习速度,可以设置任务的权重。然而,在训练过程中,用于执行多个任务的不同子模型的梯度变化难以用一个常量来表示。即使有些子模型的梯度变化可以用常量来表示,也需要针对一个子模型设置多组实验来进行调试。由此,训练多任务模型的时间成本较高。In some embodiments, multi-task learning can be implemented by a multi-task model. In the process of training a multi-task model, a certain balance can be maintained between different tasks so that the parameters learned by the multi-task model have certain applicability to each task. For example, in order to balance the learning speed of different tasks, the weight of the tasks can be set. However, during the training process, the gradient changes of different sub-models used to perform multiple tasks are difficult to represent with a constant. Even if the gradient changes of some sub-models can be represented by constants, multiple sets of experiments need to be set up for one sub-model for debugging. Therefore, the time cost of training multi-task models is relatively high.
图1是根据本公开的一个实施例的多任务模型的训练方法的流程图。Figure 1 is a flow chart of a training method for a multi-task model according to an embodiment of the present disclosure.
如图1所示,该方法100可以包括操作S110至操作S150。As shown in FIG. 1 , the method 100 may include operations S110 to S150.
在本公开实施例中,多任务模型包括多个任务处理子模型和共享子模型。例如,每个任务处理子模型用于处理一个任务。又例如,共享子模型可以用于提取样本数据的特征。In embodiments of the present disclosure, the multi-tasking model includes multiple task processing sub-models and sharing sub-models. For example, each task processing submodel is used to process one task. As another example, the shared submodel can be used to extract features of sample data.
在操作S110,将样本对象的样本行为数据输入共享子模型,得到样本对象的行为特征信息。In operation S110, the sample behavior data of the sample object is input into the shared sub-model to obtain the behavior characteristic information of the sample object.
在本公开实施例中,样本行为数据可以是样本对象在一个样本时段内的行为数据。例如,在样本时段内,可以采集样本对象对一个或多个历史信息执行的历史行为。在一个示例中,历史信息例如可以是样本对象已经观看过的短视频,这些历史行为例如可以包括点击、关注、点赞、评论等行为。In the embodiment of the present disclosure, the sample behavior data may be the behavior data of the sample object within a sample period. For example, within the sample period, the historical behavior performed by the sample object on one or more historical information can be collected. In one example, the historical information may be, for example, short videos that the sample subject has watched. These historical behaviors may include, for example, clicks, followings, likes, comments, and other behaviors.
在本公开实施例中,行为特征信息包括多个行为特征子信息。In the embodiment of the present disclosure, the behavioral characteristic information includes a plurality of behavioral characteristic sub-information.
例如,一个行为特征子信息可以与一个行为对应。For example, a behavioral characteristic sub-information can correspond to a behavior.
在操作S120,根据与任务处理子模型相关的行为特征子信息,得到任务处理子模型的子损失值。In operation S120, a sub-loss value of the task processing sub-model is obtained based on the behavioral characteristic sub-information related to the task processing sub-model.
在本公开实施例中,可以利用任务处理子模型处理行为特征子信息,得到输出结果。根据输出结果,确定任务处理子模型的子损失值。In the embodiment of the present disclosure, the task processing sub-model can be used to process the behavioral characteristic sub-information to obtain the output result. Based on the output results, determine the sub-loss value of the task processing sub-model.
例如,输出结果可以指示样本对象对样本信息执行一个行为的概率。 在一个示例中,样本信息可以是样本对象未观看过的短视频。For example, the output result can indicate the probability that the sample object performs an action on the sample information. In one example, the sample information may be a short video that the sample subject has not watched.
又例如,基于有监督学习、半监督学习或无监督学习的方式,根据输出结果,可以利用各种损失函数确定任务处理子模型的子损失值。For another example, based on supervised learning, semi-supervised learning or unsupervised learning, various loss functions can be used to determine the sub-loss value of the task processing sub-model based on the output results.
在操作S130,根据任务处理子模型的子损失值,确定与子损失值对应的目标梯度值。In operation S130, a target gradient value corresponding to the sub-loss value is determined according to the sub-loss value of the task processing sub-model.
在本公开实施例中,目标梯度值可以为共享子模型的一个梯度值。In the embodiment of the present disclosure, the target gradient value may be a gradient value of the shared sub-model.
例如,任务处理子模型可以包括至少一个任务处理层。又例如,共享子模型可以包括至少一个共享层For example, the task processing sub-model may include at least one task processing layer. As another example, the shared submodel may include at least one shared layer
例如,根据任务处理子模型的子损失值,可以确定任务处理子模型的至少一个梯度值,也可以确定共享子模型的至少一个梯度值。可以根据共享子模型的至少一个梯度值,确定与子损失值对应的目标梯度值。For example, according to the sub-loss value of the task processing sub-model, at least one gradient value of the task processing sub-model can be determined, or at least one gradient value of the shared sub-model can be determined. The target gradient value corresponding to the sub-loss value may be determined based on at least one gradient value of the shared sub-model.
例如,可以将共享子模型的至少一个梯度值中的任一个梯度值作为目标梯度值。For example, any one of the at least one gradient value of the shared sub-model can be used as the target gradient value.
在操作S140,根据多个目标梯度值,确定与子损失值对应的权重值。In operation S140, a weight value corresponding to the sub-loss value is determined according to the plurality of target gradient values.
在本公开实施例中,根据多个目标梯度值以及与子损失值对应的目标梯度值,可以确定与子损失值对应的权重值。In embodiments of the present disclosure, according to multiple target gradient values and the target gradient value corresponding to the sub-loss value, the weight value corresponding to the sub-loss value may be determined.
例如,可以根据多个目标梯度值,进行各种运算,得到梯度值的运算结果。各种运算例如可以包括求和、相乘等等。又例如,根据梯度值的运算结果以及与子损失值对应的目标梯度值,进行各种运算,可以得到与子损失值对应的权重值。For example, various operations can be performed based on multiple target gradient values to obtain the operation results of the gradient values. Various operations may include, for example, summation, multiplication, and the like. For another example, by performing various operations based on the calculation result of the gradient value and the target gradient value corresponding to the sub-loss value, the weight value corresponding to the sub-loss value can be obtained.
在操作S150,根据多个子损失值以及分别与多个子损失值对应的多个权重值,训练多任务模型。In operation S150, a multi-task model is trained according to a plurality of sub-loss values and a plurality of weight values respectively corresponding to the plurality of sub-loss values.
在本公开实施例中,根据多个子损失值以及分别与多个子损失值对应的多个权重值,可以得到损失值。根据损失值,可以训练多任务模型。In the embodiment of the present disclosure, a loss value can be obtained based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values. Based on the loss value, a multi-task model can be trained.
例如,可以根据多个子损失值以及分别与多个子损失值对应的多个权重值,进行各种运算,得到损失值。在一个示例中,各种运算可以包括加权求和、加权平均等等。For example, various operations can be performed based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values to obtain the loss value. In one example, various operations may include weighted sums, weighted averages, and the like.
例如,基于梯度下降算法或反向传播算法,可以根据损失值训练多任务模型。For example, based on the gradient descent algorithm or the backpropagation algorithm, a multi-task model can be trained based on the loss value.
通过本公开实施例,根据多个目标梯度值确定了最终的损失值,关注 了不同任务之间学习速度的差异,可以有效地平衡不同任务的学习速度,使得多任务模型学习到的参数对不同的任务都有一定的适用性。由此获得的多任务模型具有更强的多任务并行处理能力,在硬件资源预先设定或有限的情况下,提高了数据处理效率,可以根据对象的行为数据推荐更加准确的信息。Through the embodiment of the present disclosure, the final loss value is determined based on multiple target gradient values. Pay attention to It eliminates the difference in learning speed between different tasks and can effectively balance the learning speed of different tasks, making the parameters learned by the multi-task model have certain applicability to different tasks. The multi-task model obtained thus has stronger multi-task parallel processing capabilities, improves data processing efficiency when hardware resources are preset or limited, and can recommend more accurate information based on the object's behavioral data.
可以理解,上文对本公开的多任务模型的训练方法进行了详细描述。下面将结合相关实施例及图2对本公开中获得任务处理子模型的子损失值的方式进行详细描述。It can be understood that the training method of the multi-task model of the present disclosure is described in detail above. The method of obtaining the sub-loss value of the task processing sub-model in the present disclosure will be described in detail below in conjunction with relevant embodiments and Figure 2.
图2是根据本公开的一个实施例的获得任务处理子模型的子损失值的示意图。Figure 2 is a schematic diagram of obtaining sub-loss values of a task processing sub-model according to one embodiment of the present disclosure.
如图2所示,多任务模型可以包括共享子模型210和n个任务处理子模型。n为大于1的整数。n个任务处理子模型例如可以包括第1个任务处理子模型221、第2个任务处理子模型222、……、第n个任务处理子模型223。在一个示例中,n可以为3。As shown in Figure 2, the multi-task model may include a shared sub-model 210 and n task processing sub-models. n is an integer greater than 1. The n task processing sub-models may include, for example, the first task processing sub-model 221, the second task processing sub-model 222, ..., the n-th task processing sub-model 223. In one example, n can be 3.
在本公开实施例中,可以将样本对象的样本行为数据输入共享子模型,得到样本对象的行为特征信息。行为特征信息可以包括n个行为特征子信息。In the embodiment of the present disclosure, the sample behavior data of the sample object can be input into the shared sub-model to obtain the behavioral characteristic information of the sample object. Behavioral characteristic information may include n behavioral characteristic sub-information.
例如,共享子模型210可以包括多个共享层。多个共享层可以为第1个共享层、第2个共享层、……、第m个共享层,m为大于1的整数。第1个共享层可以处理样本行为数据,得到第1个初始行为特征信息。第2个共享层可以处理第1个初始行为特征信息,得到第2个初始行为特征信息。……第m个共享层可以处理第m-1个初始行为特征信息,得到第m个初始行为特征信息。可以将第m个初始行为特征信息作为样本对象的行为特征信息。For example, shared submodel 210 may include multiple shared layers. The multiple shared layers can be the first shared layer, the second shared layer,..., the m-th shared layer, where m is an integer greater than 1. The first shared layer can process sample behavioral data and obtain the first initial behavioral characteristic information. The second shared layer can process the first initial behavioral characteristic information and obtain the second initial behavioral characteristic information. ...The m-th shared layer can process the m-1 initial behavioral feature information and obtain the m-th initial behavioral feature information. The m-th initial behavioral characteristic information can be used as the behavioral characteristic information of the sample object.
例如,n个行为特征子信息可以包括第1个行为特征子信息、第2个行为特征子信息、……、第n个行为特征子信息。For example, the n behavioral characteristic sub-information may include the first behavioral characteristic sub-information, the second behavioral characteristic sub-information, ..., the n-th behavioral characteristic sub-information.
在本公开实施例中,根据与任务处理子模型相关的行为特征子信息,可以得到任务处理子模型的子损失值。In the embodiment of the present disclosure, according to the behavioral characteristic sub-information related to the task processing sub-model, the sub-loss value of the task processing sub-model can be obtained.
在本公开实施例中,可以将与任务处理子模型相关的行为特征子信息输入任务处理子模型,得到任务处理子模型的输出结果。根据输出结果, 可以得到任务处理子模型的子损失值。In the embodiment of the present disclosure, the behavioral characteristic sub-information related to the task processing sub-model can be input into the task processing sub-model to obtain the output result of the task processing sub-model. According to the output results, The sub-loss value of the task processing sub-model can be obtained.
例如,可以将第1个行为特征子信息输入第1个任务处理子模型221,得到第1个输出结果2211。第1个输出结果2211可以指示样本对象执行第1个行为的预测概率。在一个示例中,根据第1个输出结果2211和样本行为数据的第1个子标签,可以利用各种损失函数确定第1个子损失值2212。第1个子标签可以指示样本对象执行第1个行为的真实概率。各种损失函数例如可以包括交叉熵(Cross Entropy,CE)损失函数、L1损失函数等等。For example, the first behavioral characteristic sub-information can be input into the first task processing sub-model 221 to obtain the first output result 2211. The first output result 2211 may indicate the predicted probability of the sample object performing the first behavior. In one example, based on the first output result 2211 and the first sub-label of the sample behavior data, various loss functions can be used to determine the first sub-loss value 2212. The first sub-label can indicate the true probability of the sample object performing the first behavior. Various loss functions may include, for example, Cross Entropy (CE) loss function, L1 loss function, etc.
例如,可以将第2个行为特征子信息输入第2个任务处理子模型222,得到第2个输出结果2221。第2个输出结果2221可以指示样本对象执行第2个行为的预测概率。在一个示例中,根据第2个输出结果2221和样本行为数据的第2个子标签,可以利用各种损失函数确定第2个子损失值2222。第2个子标签可以指示样本对象执行第2个行为的真实概率。For example, the second behavioral characteristic sub-information can be input into the second task processing sub-model 222 to obtain the second output result 2221. The second output result 2221 may indicate the predicted probability of the sample object performing the second behavior. In one example, based on the second output result 2221 and the second sub-label of the sample behavior data, various loss functions can be used to determine the second sub-loss value 2222. The second sub-label can indicate the true probability of the sample object performing the second behavior.
例如,可以将第n个行为特征子信息输入第n个任务处理子模型223,得到第n个输出结果2231。第n个输出结果2231可以指示样本对象执行第n个行为的预测概率。在一个示例中,根据第n个输出结果2231和样本行为数据的第n个子标签,利用各种损失函数确定第n个子损失值2232。第n个子标签可以指示样本对象执行第n个行为的真实概率。For example, the n-th behavioral characteristic sub-information can be input into the n-th task processing sub-model 223 to obtain the n-th output result 2231. The nth output result 2231 may indicate the predicted probability of the sample object performing the nth behavior. In one example, various loss functions are used to determine the n-th sub-loss value 2232 based on the n-th output result 2231 and the n-th sub-label of the sample behavior data. The nth sub-label can indicate the true probability of the sample object performing the nth behavior.
在本公开实施例中,针对一个样本信息和样本行为数据201,可以进行人工标注,得到针对该样本信息的标签。该标签可以包括n个子标签。n个子标签例如可以包括上述的第1个子标签、第2个子标签、……、第n个子标签。In the embodiment of the present disclosure, manual annotation can be performed on a piece of sample information and sample behavior data 201 to obtain a label for the sample information. The tag can include n sub-tags. The n sub-tags may include, for example, the above-mentioned first sub-tag, second sub-tag, ..., n-th sub-tag.
可以理解,上文对获得任务处理子模型的子损失值的一些实施方式进行了详细描述。下面将结合相关实施例和图3对确定与子损失值对应的目标梯度值的方式进行详细描述。It can be understood that some implementation methods for obtaining sub-loss values of task processing sub-models are described in detail above. The method of determining the target gradient value corresponding to the sub-loss value will be described in detail below with reference to relevant embodiments and Figure 3.
图3是根据本公开的一个实施例的确定与子损失值对应的目标梯度值的示意图。FIG. 3 is a schematic diagram of determining a target gradient value corresponding to a sub-loss value according to an embodiment of the present disclosure.
如图3所示,多任务模型可以包括共享子模型310和n个任务处理子模型。n个任务处理子模型例如可以包括第1个任务处理子模型321、第2个任务处理子模型322、……、第n个任务处理子模型323。在一个示例 中,n可以为3。As shown in Figure 3, the multi-task model may include a shared sub-model 310 and n task processing sub-models. The n task processing sub-models may include, for example, the first task processing sub-model 321, the second task processing sub-model 322, ..., and the n-th task processing sub-model 323. In an example , n can be 3.
共享子模型310也可以包括多个共享层。多个共享层可以为第1个共享层、第2个共享层、……、第m个共享层,m为大于1的整数。如上文所述,第1个共享层可以处理样本行为数据,得到第1个初始行为特征信息。第2个共享层可以处理第1个初始行为特征信息,得到第2个初始行为特征信息。……第m个共享层可以处理第m-1个初始行为特征信息,得到第m个初始行为特征信息。可以将第m个初始行为特征信息作为样本对象的行为特征信息。Shared sub-model 310 may also include multiple shared layers. The multiple shared layers can be the first shared layer, the second shared layer,..., the m-th shared layer, where m is an integer greater than 1. As mentioned above, the first shared layer can process sample behavioral data and obtain the first initial behavioral characteristic information. The second shared layer can process the first initial behavioral characteristic information and obtain the second initial behavioral characteristic information. ...The m-th shared layer can process the m-1 initial behavioral feature information and obtain the m-th initial behavioral feature information. The m-th initial behavioral characteristic information can be used as the behavioral characteristic information of the sample object.
在本公开实施例中,在确定了任务处理子模型的子损失值之后,基于各种方式(例如反向传播),可以确定与该损失值对应的目标梯度值。可以理解,上述的关于第1个子损失值2212、第2个子损失值2222、……、第n个子损失值2232的详细描述,也可以适用于本实施例,本公开在此不再赘述。In embodiments of the present disclosure, after determining the sub-loss value of the task processing sub-model, based on various methods (such as backpropagation), the target gradient value corresponding to the loss value may be determined. It can be understood that the above detailed description of the first sub-loss value 2212, the second sub-loss value 2222, ..., the n-th sub-loss value 2232 can also be applied to this embodiment, and the present disclosure will not be repeated here.
例如,基于反向传播算法,根据第1个子损失值3212以及第1个任务处理子模型321的参数,可以确定第1个任务处理子模型321的梯度值。接下来,再根据共享子模型310的参数,可以确定共享子模型310的m个第一梯度值。在m个第一梯度值中,可以将第1个共享层的第一梯度值作为第1个目标梯度值3213。可以理解,在确定第1个共享层的第一梯度值的过程中,可以使用第1个共享层的相关参数。For example, based on the back propagation algorithm, based on the first sub-loss value 3212 and the parameters of the first task processing sub-model 321, the gradient value of the first task processing sub-model 321 can be determined. Next, based on the parameters of the shared sub-model 310, the m first gradient values of the shared sub-model 310 can be determined. Among the m first gradient values, the first gradient value of the first shared layer can be used as the first target gradient value 3213. It can be understood that in the process of determining the first gradient value of the first shared layer, the relevant parameters of the first shared layer can be used.
例如,基于反向传播算法,根据第2个子损失值3222以及第2个任务处理子模型322的参数,可以确定第2个任务处理子模型322的梯度值。接下来,再根据共享子模型310的参数,可以确定共享子模型310的m个第二梯度值。在m个第二梯度值中,可以将第1个共享层的第二梯度值作为第2个目标梯度值3223。可以理解,在确定第1个共享层的第二梯度值的过程中,可以使用第1个共享层的相关参数。For example, based on the back propagation algorithm, the gradient value of the second task processing sub-model 322 can be determined based on the second sub-loss value 3222 and the parameters of the second task processing sub-model 322. Next, based on the parameters of the shared sub-model 310, m second gradient values of the shared sub-model 310 can be determined. Among the m second gradient values, the second gradient value of the first shared layer can be used as the second target gradient value 3223. It can be understood that in the process of determining the second gradient value of the first shared layer, the relevant parameters of the first shared layer can be used.
例如,基于反向传播算法,根据第n个子损失值3232以及第n个任务处理子模型323的参数,可以确定第n个任务处理子模型323的梯度值。接下来,再根据共享子模型310的参数,可以确定共享子模型310的m个第n梯度值。在m个第n梯度值中,可以将第1个共享层的第n梯度值作为第n个目标梯度值3233。可以理解,在确定第1个共享层的第n梯度值 的过程中,可以使用第1个共享层的相关参数。For example, based on the back-propagation algorithm, the gradient value of the n-th task processing sub-model 323 can be determined according to the n-th sub-loss value 3232 and the parameters of the n-th task processing sub-model 323. Next, based on the parameters of the shared sub-model 310, the m nth gradient values of the shared sub-model 310 can be determined. Among the m nth gradient values, the nth gradient value of the first shared layer can be used as the nth target gradient value 3233. It can be understood that when determining the nth gradient value of the first shared layer In the process, you can use the relevant parameters of the first shared layer.
可以理解,在前向传播过程中,第1个共享层可以处理样本行为数据。在反向传播过程中,第1个共享层可以为多个共享层的最后一个共享层。It can be understood that during the forward propagation process, the first shared layer can process sample behavioral data. During the backpropagation process, the first shared layer can be the last shared layer of multiple shared layers.
在另一些实施例中,也可以将m个第一梯度值中任一个第一梯度值作为第1个目标梯度值。也可以将m个第二梯度值中任一个第二梯度值作为第2个目标梯度值。……也可以将m个第n梯度值中任一个第n梯度值作为第n个目标梯度值。In other embodiments, any first gradient value among the m first gradient values may also be used as the first target gradient value. It is also possible to use any second gradient value among the m second gradient values as the second target gradient value. ...You can also use any n-th gradient value among the m n-th gradient values as the n-th target gradient value.
可以理解,上文对确定与子损失值对应的目标梯度值的一些实施方式进行了详细描述。下面将结合相关实施例和图4对确定与子损失值对应的权重值进行详细描述。It can be understood that some embodiments of determining the target gradient value corresponding to the sub-loss value are described in detail above. Determining the weight value corresponding to the sub-loss value will be described in detail below with reference to relevant embodiments and Figure 4.
图4是根据本公开的一个实施例的多任务模型的训练方法的流程图。Figure 4 is a flowchart of a training method for a multi-task model according to one embodiment of the present disclosure.
可以理解,上述的操作S110以及操作S120也可以适用于本实施例。It can be understood that the above-mentioned operation S110 and operation S120 may also be applicable to this embodiment.
在本公开实施例中,可以将样本对象的样本行为数据输入共享子模型,得到样本对象的行为特征信息。例如,行为特征信息包括多个行为特征子信息。In the embodiment of the present disclosure, the sample behavior data of the sample object can be input into the shared sub-model to obtain the behavioral characteristic information of the sample object. For example, behavioral characteristic information includes multiple behavioral characteristic sub-information.
在本公开实施例中,根据与任务处理子模型相关的行为特征子信息,得到任务处理子模型的子损失值。例如,可以获得第1个子损失值loss_1。第1个子损失值loss_1可以是第1个任务处理子模型的子损失值。例如,可以获得第2个子损失值loss_2。第2个子损失值loss_2可以是第2个任务处理子模型的子损失值。例如,可以获得第n个子损失值loss_n。第n个子损失值loss_n可以是第n个任务处理子模型的子损失值。可以理解,上述获得第1个子损失值2212、第2个子损失值2222、……、第n个子损失值2312的方式也可以适用于本实施例,本公开在此不再赘述。In the embodiment of the present disclosure, the sub-loss value of the task processing sub-model is obtained according to the behavioral characteristic sub-information related to the task processing sub-model. For example, the first sub-loss value loss_1 can be obtained. The first sub-loss value loss_1 can be the sub-loss value of the first task processing sub-model. For example, the second sub-loss value loss_2 can be obtained. The second sub-loss value loss_2 can be the sub-loss value of the second task processing sub-model. For example, the nth sub-loss value loss_n can be obtained. The n-th sub-loss value loss_n may be the sub-loss value of the n-th task processing sub-model. It can be understood that the above-mentioned method of obtaining the first sub-loss value 2212, the second sub-loss value 2222, ..., and the n-th sub-loss value 2312 can also be applied to this embodiment, and the disclosure will not be repeated here.
接下来,可以执行操作S430、操作S440以及操作S451。Next, operations S430, S440, and S451 may be performed.
在操作S430,确定与子损失值对应的目标梯度值。In operation S430, a target gradient value corresponding to the sub-loss value is determined.
例如,可以根据第1个子损失值loss_1,确定与第1个子损失值loss_1对应的第1个目标梯度值grad_1。又例如,可以根据第2个子损失值loss_2,确定与第2个子损失值loss_2对应的第2个目标梯度值grad_2。……又例如,可以根据第n个子损失值loss_n,确定与第n个子损失值loss_n对应的第n个目标梯度值grad_n。 For example, the first target gradient value grad_1 corresponding to the first sub-loss value loss_1 can be determined based on the first sub-loss value loss_1. For another example, the second target gradient value grad_2 corresponding to the second sub-loss value loss_2 can be determined based on the second sub-loss value loss_2. ...For another example, the n-th target gradient value grad_n corresponding to the n-th sub-loss value loss_n can be determined based on the n-th sub-loss value loss_n.
可以理解,在确定第1个目标梯度值grad_1、第2个目标梯度值grad_2、……、第n个目标梯度值grad_n的过程中,可以使用共享子模型的第1个共享层的相关参数para_last_shared_layer。It can be understood that in the process of determining the first target gradient value grad_1, the second target gradient value grad_2, ..., and the nth target gradient value grad_n, the relevant parameters para_last_shared_layer of the first shared layer of the shared sub-model can be used .
可以理解,上述确定第1个目标梯度值3213、第2个目标梯度值3223、……、第n个目标梯度值3233的方式也可以适用于本实施例,本公开在此不再赘述。It can be understood that the above-mentioned method of determining the first target gradient value 3213, the second target gradient value 3223, ..., the n-th target gradient value 3233 can also be applied to this embodiment, and the disclosure will not be repeated here.
在操作S440,确定与子损失值对应的权重值。In operation S440, a weight value corresponding to the sub-loss value is determined.
在本公开实施例中,可以根据多个目标梯度值,确定与子损失值对应的权重值。In the embodiment of the present disclosure, the weight value corresponding to the sub-loss value can be determined according to multiple target gradient values.
例如,可以根据多个目标梯度值,确定处理参数值。For example, the processing parameter values can be determined based on multiple target gradient values.
例如,可以根据第1个目标梯度值grad_1、第2个目标梯度值grad_2、……、第n个目标梯度值grad_n,确定处理参数值。For example, the processing parameter value can be determined based on the first target gradient value grad_1, the second target gradient value grad_2, ..., and the nth target gradient value grad_n.
例如,根据处理参数值和与子损失值对应的目标梯度值,可以确定与子损失值对应的权重值。For example, based on the processing parameter value and the target gradient value corresponding to the sub-loss value, the weight value corresponding to the sub-loss value may be determined.
例如,对与子损失值对应的目标梯度值进行处理,可以得到处理后目标梯度值。根据处理参数值,可以对处理后目标梯度值进行归一化处理,得到归一化梯度值。根据归一化梯度值的倒数,可以确定与子损失值对应的权重值。在训练过程中,对于产生的梯度较大的任务,通过本公开实施例,该任务的权重值可以较小。在最终的损失值中,该任务的比重下降,可以有效地平衡不同任务之间的学习速度。For example, by processing the target gradient value corresponding to the sub-loss value, the processed target gradient value can be obtained. According to the processing parameter value, the processed target gradient value can be normalized to obtain the normalized gradient value. According to the reciprocal of the normalized gradient value, the weight value corresponding to the sub-loss value can be determined. During the training process, for a task that generates a larger gradient, the weight value of the task can be smaller through the embodiments of the present disclosure. In the final loss value, the proportion of the task decreases, which can effectively balance the learning speed between different tasks.
例如,可以通过以下公式确定与第i个子损失值loss_i对应的第i个权重值w_i。
For example, the i-th weight value w_i corresponding to the i-th sub-loss value loss_i can be determined by the following formula.
i可以为大于或等于1的整数,i可以为小于或等于n的整数。i can be an integer greater than or equal to 1, and i can be an integer less than or equal to n.
iexp(grad_i)可以为处理参数值。exp(grad_i)可以为第i个处理后目标梯度值。i exp(grad_i) can be a processing parameter value. exp(grad_i) can be the i-th processed target gradient value.
可以理解,根据公式一,可以确定第1个权重值w_1、第2个权重值w_2、……、第n个权重值W_n。It can be understood that according to Formula 1, the first weight value w_1, the second weight value w_2,..., and the nth weight value W_n can be determined.
在一个示例中,对第1个目标梯度值grad_1进行处理,可以得到第1 个处理后目标梯度值exp(grad_1)。根据处理参数值∑iexp(grad_i),对第1个处理后目标梯度值exp(grad_1)进行归一化,可以得到第1个归一化梯度值exp(grad_1)/∑iexp(grad_i)。可以将第1个归一化梯度值的倒数,作为第1个权重值w_1。In an example, the first target gradient value grad_1 is processed to obtain the first A processed target gradient value exp(grad_1). According to the processing parameter value ∑ i exp(grad_i), normalize the first processed target gradient value exp(grad_1) to get the first normalized gradient value exp(grad_1)/∑ i exp(grad_i) . The reciprocal of the first normalized gradient value can be used as the first weight value w_1.
在操作S451,获得损失值。In operation S451, a loss value is obtained.
在本公开实施例中,可以根据多个子损失值以及分别与多个子损失值对应的多个权重值,得到损失值。In the embodiment of the present disclosure, the loss value can be obtained based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
例如,可以根据多个子损失值以及分别与多个子损失值对应的多个权重值,进行加权求和,得到损失值。For example, the loss value can be obtained by performing a weighted sum based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
例如,可以根据第1个子损失值loss_1、第2个子损失值loss_2、……、第n个子损失值loss_n以及第1个权重值w_1、第2个权重值w_2、……、第n个权重值w_n,进行加权求和,得到损失值Loss。For example, the first sub-loss value loss_1, the second sub-loss value loss_2,..., the n-th sub-loss value loss_n and the first weight value w_1, the second weight value w_2,..., the n-th weight value can be used w_n, perform weighted summation to obtain the loss value Loss.
在一个示例中,可以通过以下公式得到损失值Loss:
Loss=loss_1*w_1+loss_2*w_2+…...+loss_n*w_n  (公式二)
In an example, the loss value Loss can be obtained by the following formula:
Loss=loss_1*w_1+loss_2*w_2+………+loss_n*w_n (Formula 2)
在本公开实施例中,可以根据损失值,训练多任务模型。In the embodiment of the present disclosure, a multi-task model can be trained according to the loss value.
例如,可以根据损失值,分别调整多个任务处理子模型以及共享子模型的参数,以训练多任务模型。在一个示例中,基于反向传播算法,可以利用损失值调整多个任务处理子模型和共享子模型的参数。For example, the parameters of multiple task processing sub-models and shared sub-models can be adjusted separately according to the loss value to train a multi-task model. In one example, based on the backpropagation algorithm, the loss value can be used to adjust parameters of multiple task processing sub-models and shared sub-models.
图5是根据本公开的另一个实施例的信息推荐方法的流程图。Figure 5 is a flow chart of an information recommendation method according to another embodiment of the present disclosure.
如图5所示,方法500可以包括操作S510至操作S520。As shown in FIG. 5 , the method 500 may include operations S510 to S520.
在操作S510,将目标对象的目标行为数据输入多任务模型,得到多个输出结果。In operation S510, the target behavior data of the target object is input into the multi-task model to obtain multiple output results.
在本公开实施例中,多任务模型可以是根据本公开提供的方法训练的。例如,多任务模型可以根据方法100训练得到的。In embodiments of the present disclosure, the multi-task model may be trained according to the method provided by the present disclosure. For example, a multi-task model can be trained according to method 100.
在本公开实施例中,输出结果可以指示目标对象对一个候选信息执行一个动作的概率。In embodiments of the present disclosure, the output result may indicate the probability that the target object performs an action on a piece of candidate information.
例如,候选信息可以为短视频、图像或文本等信息。在一个示例中,一个输出结果可以指示目标对象对该候选信息执行“点击”动作的概率。在一个示例中,另一个输出结果可以指示目标对象对该候选信息执行“评论”动作的概率。 For example, the candidate information can be short video, image or text information. In one example, an output result may indicate the probability that the target object performs a "click" action on the candidate information. In one example, another output result may indicate the probability that the target object performs a "comment" action on the candidate information.
在操作S520,根据多个输出结果,向目标对象推荐目标信息。In operation S520, target information is recommended to the target object according to the plurality of output results.
在本公开实施例中,根据多个输出结果,确定候选信息的推荐参数。根据多个候选信息的推荐参数,从多个候选信息中确定目标信息。向目标对象推荐目标信息。In the embodiment of the present disclosure, recommended parameters of candidate information are determined based on multiple output results. The target information is determined from the plurality of candidate information according to the recommended parameters of the plurality of candidate information. Recommend targeted information to target audiences.
例如,输出结果可以归一化为一个大于0且小于1的值。根据多个归一化后的输出结果,可以进行各种运算,得到一个候选信息的推荐参数。在一个示例中,各种运算例如可以包括求平均、求和、加权求和等等。For example, the output can be normalized to a value greater than 0 and less than 1. Based on multiple normalized output results, various operations can be performed to obtain recommended parameters for a candidate information. In one example, various operations may include, for example, averaging, summing, weighted sums, and the like.
例如,可以将推荐参数最大的候选信息作为目标信息,推荐给目标对象。For example, the candidate information with the largest recommendation parameter can be used as the target information and recommended to the target object.
通过本公开实施例,可以准确地向目标对象推荐信息,提高信息推送效率,提高用户体验。Through the embodiments of the present disclosure, information can be accurately recommended to target objects, information push efficiency can be improved, and user experience can be improved.
可以理解,在信息推荐等相关领域中,可以先为目标对象召回大量的候选信息,再从召回的候选信息中确定目标信息。上文对从候选信息中确定目标信息的一些实施方式进行了详细描述,下面将对召回候选信息的一些实施方式进行说明。It can be understood that in related fields such as information recommendation, a large amount of candidate information can be recalled for the target object first, and then the target information can be determined from the recalled candidate information. Some implementations of determining target information from candidate information are described in detail above, and some implementations of recalling candidate information will be described below.
在一些实施例中,可以根据目标对象的目标行为数据,从多个初始信息中确定候选信息。In some embodiments, candidate information may be determined from a plurality of initial information based on target behavior data of the target object.
例如,可以将目标行为数据转换一个目标向量。计算目标向量与初始信息的特征向量之间的相似度。在该相似度大于预设相似度阈值的情况下,可以将与该相似度对应的初始信息作为一个候选信息。For example, target behavior data can be converted into a target vector. Calculate the similarity between the target vector and the feature vector of the initial information. When the similarity is greater than the preset similarity threshold, the initial information corresponding to the similarity can be used as a candidate information.
例如,可以确定多个候选信息。For example, multiple candidate information may be determined.
可以理解,上文对本公开提供的多任务模型在信息推荐领域的应用方式进行了详细说明,但本公开不限于此。本公开提供的多任务模型也可以应用于其他领域(例如图像处理、文本处理、音频处理等)。It can be understood that the application method of the multi-task model provided by the present disclosure in the field of information recommendation is described in detail above, but the present disclosure is not limited thereto. The multi-task model provided by this disclosure can also be applied to other fields (such as image processing, text processing, audio processing, etc.).
图6是根据本公开的一个实施例的多任务模型的训练装置的框图。Figure 6 is a block diagram of a training device for a multi-task model according to an embodiment of the present disclosure.
在本公开实施例中,多任务模型可以包括多个任务处理子模型和共享子模型。In embodiments of the present disclosure, the multi-tasking model may include multiple task processing sub-models and sharing sub-models.
如图6所示,该装置600可以包括第一获得模块610、第二获得模块620、第一确定模块630、第二确定模块640和训练模块650。As shown in FIG. 6 , the device 600 may include a first obtaining module 610 , a second obtaining module 620 , a first determining module 630 , a second determining module 640 and a training module 650 .
第一获得模块610,用于将样本对象的样本行为数据输入共享子模型, 得到样本对象的行为特征信息。例如,行为特征信息包括多个行为特征子信息。The first obtaining module 610 is used to input the sample behavior data of the sample object into the shared sub-model, Obtain the behavioral characteristic information of the sample object. For example, behavioral characteristic information includes multiple behavioral characteristic sub-information.
第二获得模块620,用于根据与任务处理子模型相关的行为特征子信息,得到任务处理子模型的子损失值。The second obtaining module 620 is used to obtain the sub-loss value of the task processing sub-model according to the behavioral characteristic sub-information related to the task processing sub-model.
第一确定模块630,用于根据任务处理子模型的子损失值,确定与子损失值对应的目标梯度值。The first determination module 630 is configured to determine the target gradient value corresponding to the sub-loss value according to the sub-loss value of the task processing sub-model.
第二确定模块640,用于根据多个目标梯度值,确定与子损失值对应的权重值。The second determination module 640 is used to determine the weight value corresponding to the sub-loss value according to multiple target gradient values.
训练模块650,用于根据多个子损失值以及分别与多个子损失值对应的多个权重值,训练多任务模型。The training module 650 is used to train a multi-task model based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values.
在一些实施例中,第二确定模块包括:第一确定子模块,用于根据多个目标梯度值,确定处理参数值;以及第二确定子模块,用于根据处理参数值和与子损失值对应的目标梯度值,确定与子损失值对应的权重值。In some embodiments, the second determination module includes: a first determination sub-module, used to determine the processing parameter value according to the plurality of target gradient values; and a second determination sub-module, used to determine the processing parameter value and the sub-loss value according to the processing parameter value and the sub-loss value. The corresponding target gradient value determines the weight value corresponding to the sub-loss value.
在一些实施例中,第二确定子模块包括:处理单元,用于对与子损失值对应的目标梯度值进行处理,得到处理后目标梯度值;归一化处理单元,用于根据处理参数值,对处理后目标梯度值进行归一化处理,得到归一化梯度值;以及确定单元,用于根据归一化梯度值的倒数,确定与子损失值对应的权重值。In some embodiments, the second determination sub-module includes: a processing unit, used to process the target gradient value corresponding to the sub-loss value, to obtain the processed target gradient value; a normalization processing unit, used to process the target gradient value according to the processing parameter value , normalize the processed target gradient value to obtain a normalized gradient value; and a determination unit used to determine the weight value corresponding to the sub-loss value based on the reciprocal of the normalized gradient value.
在一些实施例中,第二获得模块包括:第一获得子模块,用于将与任务处理子模型相关的行为特征子信息输入任务处理子模型,得到任务处理子模型的输出结果;以及第二获得子模块,用于根据输出结果,得到任务处理子模型的子损失值。In some embodiments, the second obtaining module includes: a first obtaining sub-module, used to input the behavioral characteristic sub-information related to the task processing sub-model into the task processing sub-model, and obtain the output result of the task processing sub-model; and a second obtaining sub-module. Obtain sub-module, which is used to obtain the sub-loss value of the task processing sub-model based on the output results.
在一些实施例中,训练模块包括:第三获得子模块,用于根据多个子损失值以及分别与多个子损失值对应的多个权重值,得到损失值;以及训练子模块,用于根据损失值,训练多任务模型。In some embodiments, the training module includes: a third obtaining sub-module, used to obtain a loss value according to multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values; and a training sub-module, used to obtain a loss value according to the loss value to train a multi-task model.
在一些实施例中,第三获得子模块包括:加权求和单元,用于根据多个子损失值以及分别与多个子损失值对应的多个权重值,进行加权求和,得到损失值。In some embodiments, the third obtaining sub-module includes: a weighted summation unit, configured to perform weighted summation based on multiple sub-loss values and multiple weight values respectively corresponding to the multiple sub-loss values, to obtain a loss value.
在一些实施例中,训练子模块包括:调整单元,用于根据损失值,分别调整多个任务处理子模型以及共享子模型的参数,以训练多任务模型。 In some embodiments, the training sub-module includes: an adjustment unit, configured to respectively adjust parameters of multiple task processing sub-models and shared sub-models according to the loss value to train the multi-task model.
图7是根据本公开的另一个实施例的信息推荐装置的框图。FIG. 7 is a block diagram of an information recommendation device according to another embodiment of the present disclosure.
如图7所示,该装置700可以包括第三获得模块710和推荐模块720。As shown in FIG. 7 , the device 700 may include a third obtaining module 710 and a recommendation module 720 .
第三获得模块710,用于将目标对象的目标行为数据输入多任务模型,得到多个输出结果。The third obtaining module 710 is used to input the target behavior data of the target object into the multi-task model to obtain multiple output results.
推荐模块720,用于根据多个输出结果,向目标对象推荐目标信息。The recommendation module 720 is used to recommend target information to target objects based on multiple output results.
例如,多任务模型是根据本公开提供的装置训练得到的。For example, the multi-task model is trained according to the device provided by the present disclosure.
根据本公开的实施例,本公开还提供了一种电子设备、一种可读存储介质和一种计算机程序产品。According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
在本公开实施例中,电子设备包括:至少一个处理器;以及与至少一个处理器通信连接的存储器。例如,存储器存储有可被至少一个处理器执行的指令,指令被至少一个处理器执行,以使至少一个处理器能够执行根据本公开提供的方法。In an embodiment of the present disclosure, an electronic device includes: at least one processor; and a memory communicatively connected with the at least one processor. For example, the memory stores instructions executable by at least one processor, and the instructions are executed by at least one processor, so that the at least one processor can perform the method provided according to the present disclosure.
在本公开实施例中,可读存储介质存储有计算机指令,可读存储介质可以为非瞬时计算机可读存储介质。例如,计算机指令可以使计算机执行根据本公开提供的方法。In embodiments of the present disclosure, the readable storage medium stores computer instructions, and the readable storage medium may be a non-transitory computer-readable storage medium. For example, computer instructions may cause a computer to perform methods provided in accordance with the present disclosure.
在本公开实施例中,计算机程序产品包括计算机程序,该计算机程序在被处理器执行时实现根据本公开提供的方法。下面将结合图8进行详细说明。In embodiments of the present disclosure, a computer program product includes a computer program that, when executed by a processor, implements a method provided in accordance with the present disclosure. Detailed description will be given below with reference to Figure 8 .
图8示出了可以用来实施本公开的实施例的示例电子设备800的示意性框图。电子设备旨在表示各种形式的数字计算机,诸如,膝上型计算机、台式计算机、工作台、个人数字助理、服务器、刀片式服务器、大型计算机、和其它适合的计算机。电子设备还可以表示各种形式的移动装置,诸如,个人数字处理、蜂窝电话、智能电话、可穿戴设备和其它类似的计算装置。本文所示的部件、它们的连接和关系、以及它们的功能仅仅作为示例,并且不意在限制本文中描述的和/或者要求的本公开的实现。Figure 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to refer to various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit implementations of the disclosure described and/or claimed herein.
如图8所示,设备800包括计算单元801,其可以根据存储在只读存储器(ROM)802中的计算机程序或者从存储单元808加载到随机访问存储器(RAM)803中的计算机程序,来执行各种适当的动作和处理。在RAM 803中,还可存储设备800操作所需的各种程序和数据。计算单元801、ROM 802以及RAM 803通过总线804彼此相连。输入/输出(I/O) 接口805也连接至总线804。As shown in FIG. 8 , the device 800 includes a computing unit 801 that can execute according to a computer program stored in a read-only memory (ROM) 802 or loaded from a storage unit 808 into a random access memory (RAM) 803 Various appropriate actions and treatments. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. Computing unit 801, ROM 802 and RAM 803 are connected to each other via bus 804. Input/output (I/O) Interface 805 is also connected to bus 804.
设备800中的多个部件连接至I/O接口805,包括:输入单元806,例如键盘、鼠标等;输出单元807,例如各种类型的显示器、扬声器等;存储单元808,例如磁盘、光盘等;以及通信单元809,例如网卡、调制解调器、无线通信收发机等。通信单元809允许设备800通过诸如因特网的计算机网络和/或各种电信网络与其他设备交换信息/数据。Multiple components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, etc.; an output unit 807, such as various types of displays, speakers, etc.; a storage unit 808, such as a magnetic disk, optical disk, etc. ; and communication unit 809, such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.
计算单元801可以是各种具有处理和计算能力的通用和/或专用处理组件。计算单元801的一些示例包括但不限于中央处理单元(CPU)、图形处理单元(GPU)、各种专用的人工智能(AI)计算芯片、各种运行机器学习模型算法的计算单元、数字信号处理器(DSP)、以及任何适当的处理器、控制器、微控制器等。计算单元801执行上文所描述的各个方法和处理,例如多任务模型的训练方法和/或信息推荐方法。例如,在一些实施例中,多任务模型的训练方法和/或信息推荐方法可被实现为计算机软件程序,其被有形地包含于机器可读介质,例如存储单元808。在一些实施例中,计算机程序的部分或者全部可以经由ROM 802和/或通信单元809而被载入和/或安装到设备800上。当计算机程序加载到RAM 803并由计算单元801执行时,可以执行上文描述的多任务模型的训练方法和/或信息推荐方法的一个或多个步骤。备选地,在其他实施例中,计算单元801可以通过其他任何适当的方式(例如,借助于固件)而被配置为执行多任务模型的训练方法和/或信息推荐方法。Computing unit 801 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, digital signal processing processor (DSP), and any appropriate processor, controller, microcontroller, etc. The computing unit 801 performs various methods and processes described above, such as a multi-task model training method and/or an information recommendation method. For example, in some embodiments, the multi-task model training method and/or the information recommendation method may be implemented as a computer software program, which is tangibly included in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the training method of the multi-task model and/or the information recommendation method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the multi-task model training method and/or the information recommendation method in any other suitable manner (eg, by means of firmware).
本文中以上描述的系统和技术的各种实施方式可以在数字电子电路系统、集成电路系统、场可编程门阵列(FPGA)、专用集成电路(ASIC)、专用标准产品(ASSP)、芯片上系统的系统(SOC)、复杂可编程逻辑设备(CPLD)、计算机硬件、固件、软件、和/或它们的组合中实现。这些各种实施方式可以包括:实施在一个或者多个计算机程序中,该一个或者多个计算机程序可在包括至少一个可编程处理器的可编程系统上执行和/或解释,该可编程处理器可以是专用或者通用可编程处理器,可以从存储系统、至少一个输入装置、和至少一个输出装置接收数据和指令,并且将数据和指令传输至该存储系统、该至少一个输入装置、和该至少一个输出装置。 Various implementations of the systems and techniques described above may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip implemented in a system (SOC), complex programmable logic device (CPLD), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs executable and/or interpreted on a programmable system including at least one programmable processor, the programmable processor The processor, which may be a special purpose or general purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device. An output device.
用于实施本公开的方法的程序代码可以采用一个或多个编程语言的任何组合来编写。这些程序代码可以提供给通用计算机、专用计算机或其他可编程数据处理装置的处理器或控制器,使得程序代码当由处理器或控制器执行时使流程图和/或框图中所规定的功能/操作被实施。程序代码可以完全在机器上执行、部分地在机器上执行,作为独立软件包部分地在机器上执行且部分地在远程机器上执行或完全在远程机器或服务器上执行。Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing device, such that the program codes, when executed by the processor or controller, cause the functions specified in the flowcharts and/or block diagrams/ The operation is implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
在本公开的上下文中,机器可读介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的程序。机器可读介质可以是机器可读信号介质或机器可读储存介质。机器可读介质可以包括但不限于电子的、磁性的、光学的、电磁的、红外的、或半导体系统、装置或设备,或者上述内容的任何合适组合。机器可读存储介质的更具体示例会包括基于一个或多个线的电气连接、便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或快闪存储器)、光纤、便捷式紧凑盘只读存储器(CD-ROM)、光学储存设备、磁储存设备、或上述内容的任何合适组合。In the context of this disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media would include one or more wire-based electrical connections, laptop disks, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
为了提供与用户的交互,可以在计算机上实施此处描述的系统和技术,该计算机具有:用于向用户显示信息的显示装置(例如,CRT(阴极射线管)显示器或者LCD(液晶显示器));以及键盘和指向装置(例如,鼠标或者轨迹球),用户可以通过该键盘和该指向装置来将输入提供给计算机。其它种类的装置还可以用于提供与用户的交互;例如,提供给用户的反馈可以是任何形式的传感反馈(例如,视觉反馈、听觉反馈、或者触觉反馈);并且可以用任何形式(包括声输入、语音输入或者、触觉输入)来接收来自用户的输入。To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (eg, a CRT (Cathode Ray Tube) display or an LCD (Liquid Crystal Display)) for displaying information to the user. ; and a keyboard and pointing device (eg, a mouse or a trackball) through which a user can provide input to the computer. Other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and may be provided in any form, including Acoustic input, voice input or tactile input) to receive input from the user.
可以将此处描述的系统和技术实施在包括后台部件的计算系统(例如,作为数据服务器)、或者包括中间件部件的计算系统(例如,应用服务器)、或者包括前端部件的计算系统(例如,具有图形用户界面或者网络浏览器的用户计算机,用户可以通过该图形用户界面或者该网络浏览器来与此处描述的系统和技术的实施方式交互)、或者包括这种后台部件、中间件部件、或者前端部件的任何组合的计算系统中。可以通过任何形式或者介质 的数字数据通信(例如,通信网络)来将系统的部件相互连接。通信网络的示例包括:局域网(LAN)、广域网(WAN)和互联网。The systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., A user's computer having a graphical user interface or web browser through which the user can interact with implementations of the systems and technologies described herein), or including such backend components, middleware components, or any combination of front-end components in a computing system. in any form or medium Digital data communications (e.g., communications networks) to connect the components of the system to each other. Examples of communication networks include: local area network (LAN), wide area network (WAN), and the Internet.
计算机系统可以包括客户端和服务器。客户端和服务器一般远离彼此并且通常通过通信网络进行交互。通过在相应的计算机上运行并且彼此具有客户端-服务器关系的计算机程序来产生客户端和服务器的关系。服务器可以是云服务器,也可以为分布式系统的服务器,或者是结合了区块链的服务器。Computer systems may include clients and servers. Clients and servers are generally remote from each other and typically interact over a communications network. The relationship of client and server is created by computer programs running on corresponding computers and having a client-server relationship with each other. The server can be a cloud server, a distributed system server, or a server combined with a blockchain.
应该理解,可以使用上面所示的各种形式的流程,重新排序、增加或删除步骤。例如,本公开中记载的各步骤可以并行地执行也可以顺序地执行也可以不同的次序执行,只要能够实现本公开公开的技术方案所期望的结果,本文在此不进行限制。It should be understood that various forms of the process shown above may be used, with steps reordered, added or deleted. For example, each step described in the present disclosure can be executed in parallel, sequentially, or in a different order. As long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, there is no limitation here.
上述具体实施方式,并不构成对本公开保护范围的限制。本领域技术人员应该明白的是,根据设计要求和其他因素,可以进行各种修改、组合、子组合和替代。任何在本公开的精神和原则之内所作的修改、等同替换和改进等,均应包含在本公开保护范围之内。 The above-mentioned specific embodiments do not constitute a limitation on the scope of the present disclosure. It will be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions are possible depending on design requirements and other factors. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and principles of this disclosure shall be included in the protection scope of this disclosure.

Claims (19)

  1. 一种多任务模型的训练方法,所述多任务模型包括多个任务处理子模型和共享子模型,所述方法包括:A training method for a multi-task model. The multi-task model includes multiple task processing sub-models and shared sub-models. The method includes:
    将样本对象的样本行为数据输入所述共享子模型,得到所述样本对象的行为特征信息,其中,所述行为特征信息包括多个行为特征子信息;Input the sample behavior data of the sample object into the shared sub-model to obtain the behavioral characteristic information of the sample object, where the behavioral characteristic information includes a plurality of behavioral characteristic sub-information;
    根据与所述任务处理子模型相关的所述行为特征子信息,得到所述任务处理子模型的子损失值;Obtain the sub-loss value of the task processing sub-model according to the behavioral characteristic sub-information related to the task processing sub-model;
    根据所述任务处理子模型的子损失值,确定与所述子损失值对应的目标梯度值;According to the sub-loss value of the task processing sub-model, determine the target gradient value corresponding to the sub-loss value;
    根据多个所述目标梯度值,确定与所述子损失值对应的权重值;以及Determine a weight value corresponding to the sub-loss value according to a plurality of the target gradient values; and
    根据多个所述子损失值以及分别与多个所述子损失值对应的多个所述权重值,训练所述多任务模型。The multi-task model is trained according to a plurality of sub-loss values and a plurality of weight values respectively corresponding to the plurality of sub-loss values.
  2. 根据权利要求1所述的方法,其中,所述根据多个所述目标梯度值,确定与所述子损失值对应的权重值包括:The method according to claim 1, wherein determining the weight value corresponding to the sub-loss value according to a plurality of the target gradient values includes:
    根据多个所述目标梯度值,确定处理参数值;以及determining processing parameter values based on a plurality of the target gradient values; and
    根据所述处理参数值和与所述子损失值对应的所述目标梯度值,确定与所述子损失值对应的所述权重值。The weight value corresponding to the sub-loss value is determined according to the processing parameter value and the target gradient value corresponding to the sub-loss value.
  3. 根据权利要求2所述的方法,其中,所述根据所述处理参数值和与所述子损失值对应的目标梯度值,确定与所述子损失值对应的所述权重值包括:The method of claim 2, wherein determining the weight value corresponding to the sub-loss value according to the processing parameter value and the target gradient value corresponding to the sub-loss value includes:
    对与所述子损失值对应的目标梯度值进行处理,得到处理后目标梯度值;Process the target gradient value corresponding to the sub-loss value to obtain the processed target gradient value;
    根据所述处理参数值,对所述处理后目标梯度值进行归一化处理,得到归一化梯度值;以及According to the processing parameter value, normalize the processed target gradient value to obtain a normalized gradient value; and
    根据所述归一化梯度值的倒数,确定与所述子损失值对应的所述权重值。The weight value corresponding to the sub-loss value is determined according to the reciprocal of the normalized gradient value.
  4. 根据权利要求1所述的方法,其中,所述根据与所述任务处理子模型相关的所述行为特征子信息,得到所述任务处理子模型的子损失值包括:The method according to claim 1, wherein obtaining the sub-loss value of the task processing sub-model based on the behavioral characteristic sub-information related to the task processing sub-model includes:
    将与所述任务处理子模型相关的所述行为特征子信息输入所述任务 处理子模型,得到所述任务处理子模型的输出结果;以及Input the behavioral characteristic sub-information related to the task processing sub-model into the task Process the sub-model to obtain the output result of the task processing sub-model; and
    根据所述输出结果,得到所述任务处理子模型的所述子损失值。According to the output result, the sub-loss value of the task processing sub-model is obtained.
  5. The method according to claim 1, wherein training the multi-task model according to a plurality of the sub-loss values and a plurality of the weight values respectively corresponding to the plurality of sub-loss values comprises:
    obtaining a loss value according to the plurality of sub-loss values and the plurality of weight values respectively corresponding to the plurality of sub-loss values; and
    training the multi-task model according to the loss value.
  6. The method according to claim 5, wherein obtaining the loss value according to the plurality of sub-loss values and the plurality of weight values respectively corresponding to the plurality of sub-loss values comprises:
    performing a weighted summation on the plurality of sub-loss values using the plurality of weight values respectively corresponding to the plurality of sub-loss values, to obtain the loss value.
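As a toy numerical check of the weighted summation in claim 6 (all numbers made up):

```python
sub_losses = [0.8, 0.2, 1.5]   # made-up sub-loss values
weights = [0.5, 2.0, 0.3]      # made-up corresponding weight values
loss = sum(w * l for w, l in zip(weights, sub_losses))
print(loss)  # 0.5*0.8 + 2.0*0.2 + 0.3*1.5 = 1.25 (up to float rounding)
```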
  7. The method according to claim 5, wherein training the multi-task model according to the loss value comprises:
    adjusting, according to the loss value, parameters of the plurality of task processing sub-models and of the shared sub-model respectively, so as to train the multi-task model.
  8. An information recommendation method, comprising:
    inputting target behavior data of a target object into a multi-task model to obtain a plurality of output results; and
    recommending target information to the target object according to the plurality of output results,
    wherein the multi-task model is trained by the method according to any one of claims 1 to 7.
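At inference time, the method of claim 8 could be served as in the following sketch. The fusion rule (summing per-task scores) and the top-k cut are assumptions of mine; the claim only requires that the recommendation follow from the multiple output results.

```python
import torch

@torch.no_grad()
def recommend(shared, task_models, target_behavior, top_k=10):
    # Shared sub-model turns the target object's behavior data into per-task features.
    feats = shared(target_behavior)
    # Multiple output results, e.g. one relevance score per candidate item per task.
    outputs = [m(f) for m, f in zip(task_models, feats)]
    # Assumed fusion rule: add per-task scores, then recommend the top-k items.
    scores = torch.stack(outputs, dim=0).sum(dim=0)
    return scores.topk(top_k, dim=-1).indices
```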
  9. A training apparatus for a multi-task model, the multi-task model comprising a plurality of task processing sub-models and a shared sub-model, the apparatus comprising:
    a first obtaining module configured to input sample behavior data of a sample object into the shared sub-model to obtain behavioral characteristic information of the sample object, wherein the behavioral characteristic information comprises a plurality of pieces of behavioral characteristic sub-information;
    a second obtaining module configured to obtain a sub-loss value of the task processing sub-model according to the behavioral characteristic sub-information related to the task processing sub-model;
    a first determination module configured to determine, according to the sub-loss value of the task processing sub-model, a target gradient value corresponding to the sub-loss value;
    a second determination module configured to determine, according to a plurality of the target gradient values, a weight value corresponding to the sub-loss value; and
    a training module configured to train the multi-task model according to a plurality of the sub-loss values and a plurality of the weight values respectively corresponding to the plurality of sub-loss values.
  10. The apparatus according to claim 9, wherein the second determination module comprises:
    a first determination sub-module configured to determine a processing parameter value according to a plurality of the target gradient values; and
    a second determination sub-module configured to determine the weight value corresponding to the sub-loss value according to the processing parameter value and the target gradient value corresponding to the sub-loss value.
  11. The apparatus according to claim 10, wherein the second determination sub-module comprises:
    a processing unit configured to process the target gradient value corresponding to the sub-loss value to obtain a processed target gradient value;
    a normalization unit configured to normalize the processed target gradient value according to the processing parameter value to obtain a normalized gradient value; and
    a determination unit configured to determine the weight value corresponding to the sub-loss value according to the reciprocal of the normalized gradient value.
  12. The apparatus according to claim 9, wherein the second obtaining module comprises:
    a first obtaining sub-module configured to input the behavioral characteristic sub-information related to the task processing sub-model into the task processing sub-model to obtain an output result of the task processing sub-model; and
    a second obtaining sub-module configured to obtain the sub-loss value of the task processing sub-model according to the output result.
  13. The apparatus according to claim 9, wherein the training module comprises:
    a third obtaining sub-module configured to obtain a loss value according to a plurality of the sub-loss values and a plurality of the weight values respectively corresponding to the plurality of sub-loss values; and
    a training sub-module configured to train the multi-task model according to the loss value.
  14. The apparatus according to claim 13, wherein the third obtaining sub-module comprises:
    a weighted summation unit configured to perform a weighted summation on the plurality of sub-loss values using the plurality of weight values respectively corresponding to the plurality of sub-loss values, to obtain the loss value.
  15. The apparatus according to claim 13, wherein the training sub-module comprises:
    an adjustment unit configured to adjust, according to the loss value, parameters of the plurality of task processing sub-models and of the shared sub-model respectively, so as to train the multi-task model.
  16. An information recommendation apparatus, comprising:
    a third obtaining module configured to input target behavior data of a target object into a multi-task model to obtain a plurality of output results; and
    a recommendation module configured to recommend target information to the target object according to the plurality of output results,
    wherein the multi-task model is trained by the apparatus according to any one of claims 9 to 15.
  17. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor, wherein
    the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method according to any one of claims 1 to 8.
  18. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are configured to cause a computer to perform the method according to any one of claims 1 to 8.
  19. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
PCT/CN2023/074122 2022-08-24 2023-02-01 Multi-task model training method, information recommendation method, apparatus, and device WO2024040869A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211015938.1A CN115081630A (en) 2022-08-24 2022-08-24 Training method of multi-task model, information recommendation method, device and equipment
CN202211015938.1 2022-08-24

Publications (1)

Publication Number Publication Date
WO2024040869A1 true WO2024040869A1 (en) 2024-02-29

Family

ID=83245010

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/074122 WO2024040869A1 (en) 2022-08-24 2023-02-01 Multi-task model training method, information recommendation method, apparatus, and device

Country Status (2)

Country Link
CN (1) CN115081630A (en)
WO (1) WO2024040869A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115081630A (en) * 2022-08-24 2022-09-20 北京百度网讯科技有限公司 Training method of multi-task model, information recommendation method, device and equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541124A (en) * 2020-12-24 2021-03-23 北京百度网讯科技有限公司 Method, apparatus, device, medium and program product for generating a multitask model
CN112561077A (en) * 2020-12-14 2021-03-26 北京百度网讯科技有限公司 Training method and device of multi-task model and electronic equipment
WO2022019913A1 (en) * 2020-07-23 2022-01-27 Google Llc Systems and methods for generation of machine-learned multitask models
CN114913371A (en) * 2022-05-10 2022-08-16 平安科技(深圳)有限公司 Multitask learning model training method and device, electronic equipment and storage medium
CN115081630A (en) * 2022-08-24 2022-09-20 北京百度网讯科技有限公司 Training method of multi-task model, information recommendation method, device and equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110209817B (en) * 2019-05-31 2023-06-09 安徽省泰岳祥升软件有限公司 Training method and device for text processing model and text processing method
CN111027428B (en) * 2019-11-29 2024-03-08 北京奇艺世纪科技有限公司 Training method and device for multitasking model and electronic equipment
CN112559007B (en) * 2020-12-14 2022-09-23 北京百度网讯科技有限公司 Parameter updating method and device of multitask model and electronic equipment

Also Published As

Publication number Publication date
CN115081630A (en) 2022-09-20

Similar Documents

Publication Publication Date Title
US10956748B2 (en) Video classification method, information processing method, and server
JP2022058915A (en) Method and device for training image recognition model, method and device for recognizing image, electronic device, storage medium, and computer program
CN110390408B (en) Transaction object prediction method and device
CN112559800B (en) Method, apparatus, electronic device, medium and product for processing video
US10445586B2 (en) Deep learning on image frames to generate a summary
US20220374776A1 (en) Method and system for federated learning, electronic device, and computer readable medium
CN114861889B (en) Deep learning model training method, target object detection method and device
US20230079275A1 (en) Method and apparatus for training semantic segmentation model, and method and apparatus for performing semantic segmentation on video
US20230215136A1 (en) Method for training multi-modal data matching degree calculation model, method for calculating multi-modal data matching degree, and related apparatuses
WO2024040869A1 (en) Multi-task model training method, information recommendation method, apparatus, and device
CN114020950A (en) Training method, device and equipment of image retrieval model and storage medium
KR20220010045A (en) Domain phrase mining method, equipment and electronic device
CN115147680B (en) Pre-training method, device and equipment for target detection model
CN115496970A (en) Training method of image task model, image recognition method and related device
CN114494747A (en) Model training method, image processing method, device, electronic device and medium
CN114037059A (en) Pre-training model, model generation method, data processing method and data processing device
CN113792876A (en) Backbone network generation method, device, equipment and storage medium
CN113657411A (en) Neural network model training method, image feature extraction method and related device
CN115880506A (en) Image generation method, model training method and device and electronic equipment
CN114926322A (en) Image generation method and device, electronic equipment and storage medium
CN114282049A (en) Video retrieval method, device, equipment and storage medium
US20240037410A1 (en) Method for model aggregation in federated learning, server, device, and storage medium
CN115131709B (en) Video category prediction method, training method and device for video category prediction model
US20240029416A1 (en) Method, device, and computer program product for image processing
CN114331379B (en) Method for outputting task to be handled, model training method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23855966

Country of ref document: EP

Kind code of ref document: A1