CN115905872A - Model training method, information recommendation method, device, equipment and medium - Google Patents


Info

Publication number: CN115905872A
Authority: CN (China)
Prior art keywords: task, sub, feature, layer, model
Legal status: Pending (the listed status is an assumption, not a legal conclusion)
Application number: CN202211627517.4A
Other languages: Chinese (zh)
Inventors: 刘洁, 李腾, 高家华
Current and original assignee: Weimeng Chuangke Network Technology China Co Ltd (the listed assignee may be inaccurate)
Application filed by Weimeng Chuangke Network Technology China Co Ltd
Priority to: CN202211627517.4A
Landscapes: Image Analysis (AREA)
Abstract

The disclosure relates to a model training method, an information recommendation method, a device, equipment and a medium. The training method comprises: acquiring a training sample data set; and performing iterative training on an initial multi-task model by using the training sample data set until a convergence condition is met, to obtain a trained multi-task model. Each training iteration comprises: performing feature extraction on each training sample with an initial feature extraction layer in the initial multi-task model to obtain a first feature; performing feature processing on the first feature with an expert layer in the initial multi-task model to obtain a second feature; obtaining, based on the second feature and the task sub-network corresponding to each task in the initial multi-task model, a prediction result for each training sample output by each task sub-network for its corresponding task; and adjusting parameters of the initial multi-task model based on the prediction results output by the task sub-networks and the task labels of the training samples. This method improves model training and convergence speed.

Description

Model training method, information recommendation method, device, equipment and medium
Technical Field
The present disclosure relates to, but is not limited to, the technical field of artificial intelligence, and in particular to a training method for a multi-task model, an information recommendation method, an apparatus, an electronic device, and a storage medium.
Background
Multi-task learning trains several related tasks jointly. By sharing shallow layers during learning, information learned from the related domains is shared and complemented across tasks, so the tasks promote one another and generalization improves. Compared with a single-task model, a multi-task model can fully exploit the relatedness among tasks, so the overall effect of each task is better; and because multiple tasks share one model, the memory footprint is smaller and service inference is faster. How to further improve the effect of multi-task models remains a current concern.
Disclosure of Invention
The disclosure provides a training method of a multitask model, an information recommendation method, an information recommendation device, electronic equipment and a storage medium.
According to a first aspect of the embodiments of the present disclosure, there is provided a method for training a multitask model, including:
acquiring a training sample data set; the training sample data set comprises a plurality of training samples relevant to all tasks and labels of all tasks corresponding to all the training samples;
performing iterative training on the initial multi-task model by using the training sample data set until a convergence condition is met to obtain a trained multi-task model, wherein the process of each iterative training comprises the following steps:
performing feature extraction on each training sample in the training sample data set by using an initial feature extraction layer in the initial multitask model to obtain a first feature output by the initial feature extraction layer;
performing feature processing on the first features by utilizing an expert layer in the initial multitask model to obtain second features output by the expert layer; the expert layer comprises a plurality of feature processing layers, one feature processing layer comprises a plurality of feature processing sub-networks, the feature processing sub-networks in the adjacent feature processing layers are connected in a one-to-one correspondence manner, and each feature processing sub-network is used for processing and outputting input features;
based on the second characteristics output by the expert layer and a task sub-network corresponding to each task in the initial multi-task model, obtaining a prediction result of each training sample output by each task sub-network for the corresponding task;
and adjusting parameters of the initial multi-task model based on the prediction result of each training sample output by each task sub-network for the corresponding task and the label of each task corresponding to each training sample in the training sample data set.
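The four steps of one training iteration can be sketched in a minimal Python example; all function names and the toy computations below are illustrative assumptions for clarity, not the patented networks:

```python
# Illustrative sketch of one training iteration (toy stand-ins, not the
# patented networks): extraction -> expert layer -> task heads -> loss.
def feature_extract(sample):                 # initial feature extraction layer
    return [x * 0.5 for x in sample]         # toy embedding

def expert_process(first_feature):           # expert layer (feature processing)
    return [max(0.0, x) for x in first_feature]

def task_head(second_feature, weight):       # one task sub-network
    return sum(w * x for w, x in zip(weight, second_feature))

def train_iteration(sample, labels, head_weights):
    f1 = feature_extract(sample)
    f2 = expert_process(f1)
    preds = [task_head(f2, w) for w in head_weights]
    total_loss = sum((p - y) ** 2 for p, y in zip(preds, labels))
    return preds, total_loss                 # total loss drives parameter updates
```

In a real implementation each stage would be a learnable neural network and the parameter adjustment would be performed by backpropagation on the total loss.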
In some embodiments, the second feature output by the expert layer comprises: the sub-features output by each feature processing sub-network included in the feature processing layer of the highest layer in the expert layers;
the obtaining a prediction result of each training sample output by each task sub-network for a corresponding task based on the second feature output by the expert layer and the task sub-network corresponding to each task in the initial multi-task model includes:
for each task sub-network in the initial multi-task model, screening the sub-characteristics output by each feature processing sub-network included in the feature processing layer of the highest layer by using at least binary coding variables to obtain third characteristics corresponding to the task sub-networks;
and utilizing each task sub-network to respectively learn the third characteristics corresponding to the task sub-networks to obtain the prediction result of each training sample output by each task sub-network for the corresponding task.
In some embodiments, the screening, by using at least a binary coding variable, the sub-features output by each feature processing sub-network included in the feature processing layer of the highest layer to obtain a third feature corresponding to the task sub-network includes:
and screening and weighting the sub-characteristics output by each feature processing sub-network included in the feature processing layer of the highest layer by using the binary coding variables and the weight matrix to obtain a third characteristic corresponding to the task sub-network.
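A minimal sketch of this screening-and-weighting step follows; the element-wise scaling used here is an assumption, and the patent's weight matrix may act differently:

```python
def screen_and_weight(expert_outputs, binary_code, expert_weights):
    """Gate each top-layer expert's sub-feature with a 0/1 coding variable,
    then sum the survivors scaled by their weights into the third feature."""
    dim = len(expert_outputs[0])
    third = [0.0] * dim
    for feat, code, w in zip(expert_outputs, binary_code, expert_weights):
        if code == 1:                    # binary coding variable keeps this expert
            for i in range(dim):
                third[i] += w * feat[i]  # weighting of the kept sub-feature
    return third
```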
In some embodiments, the performing the feature processing on the first feature by using the expert layer in the initial multitasking model to obtain the second feature output by the expert layer includes:
respectively performing weighting processing and nonlinear mapping on the first features by using each feature processing sub-network included in the first feature processing layer to obtain intermediate features output by each feature processing sub-network in the first feature processing layer;
and performing weighting processing and nonlinear mapping on the intermediate features output by the corresponding feature processing sub-networks in the first feature processing layer by using each feature processing sub-network included in the second feature processing layer to obtain the second features output by the expert layer.
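The two-layer expert structure with one-to-one connections can be sketched as follows; the use of ReLU as the nonlinear mapping and the parameter layout are assumptions for illustration:

```python
def sub_network(x, w, b):
    # weighting (a linear map) followed by a nonlinear mapping (ReLU assumed)
    return [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

def expert_layer(first_feature, layer1, layer2):
    # one-to-one: the k-th sub-network of the second layer only sees the
    # k-th sub-network of the first layer (no cross-connections, no sharing)
    mids = [sub_network(first_feature, w, b) for w, b in layer1]
    return [sub_network(m, w, b) for m, (w, b) in zip(mids, layer2)]
```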
In some embodiments, the adjusting parameters of the initial multitask model based on a prediction result of each training sample output by each task subnetwork for a corresponding task and a label of each task corresponding to each training sample in the training sample data set includes:
for each task, calculating a loss value of the task based on a prediction result of each training sample output by a task sub-network of the task for the task and a label of each training sample in the training sample data set for the task;
determining a total loss of the initial multi-tasking model based on the loss value of each task;
adjusting parameters of the initial multitasking model based on the total loss.
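The loss aggregation described above can be sketched as follows; mean squared error and the optional task weights are assumptions (classification tasks would typically use cross-entropy):

```python
def task_loss(preds, labels):
    # per-task loss over the batch; squared error is an illustrative choice
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(preds)

def total_loss(preds_by_task, labels_by_task, task_weights=None):
    # total loss of the multi-task model = (optionally weighted) sum of
    # the per-task loss values
    weights = task_weights or [1.0] * len(preds_by_task)
    return sum(w * task_loss(p, y)
               for w, p, y in zip(weights, preds_by_task, labels_by_task))
```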
According to a second aspect of an embodiment of the present disclosure, there is provided an information recommendation method including:
acquiring user information of a target user and media information to be recommended;
processing the user information of the target user and each piece of media information by using a multitask model to obtain the probability of each behavior of the target user on each piece of media information; wherein the multitask model is trained based on the method of the first aspect;
determining the interest degree of the target user in each piece of media information based on the probability of each action of the target user on each piece of media information;
and ranking the interest degree of each piece of media information, and determining target media information recommended to the target user from each piece of media information according to a ranking result.
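The fusion-and-ranking flow of the second aspect can be sketched as follows; the linear weighted fusion and the behavior weights are illustrative assumptions, not a prescribed fusion rule:

```python
def interest_score(behavior_probs, behavior_weights):
    # weighted fusion of per-behavior probabilities into one interest degree
    return sum(w * p for w, p in zip(behavior_weights, behavior_probs))

def recommend(candidates, behavior_weights, top_k):
    # candidates: (media_id, [prob of like, prob of share, ...]) pairs;
    # rank by interest degree and keep the top_k as target media information
    scored = sorted(candidates,
                    key=lambda c: interest_score(c[1], behavior_weights),
                    reverse=True)
    return [media_id for media_id, _ in scored[:top_k]]
```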
According to a third aspect of the embodiments of the present disclosure, there is provided a training apparatus for a multitask model, including:
the first acquisition module is configured to acquire a training sample data set; the training sample data set comprises a plurality of training samples relevant to each task and labels of each task corresponding to each training sample;
the training module is configured to perform iterative training on the initial multi-task model by using the training sample data set until a convergence condition is met to obtain a trained multi-task model, and the process of each iterative training comprises the following steps:
the feature extraction module is configured to perform feature extraction on each training sample in the training sample data set by using an initial feature extraction layer in the initial multitask model to obtain a first feature output by the initial feature extraction layer;
the characteristic processing module is configured to perform characteristic processing on the first characteristic by utilizing an expert layer in the initial multitask model to obtain a second characteristic output by the expert layer; the expert layer comprises a plurality of feature processing layers, one feature processing layer comprises a plurality of feature processing sub-networks, the feature processing sub-networks in the adjacent feature processing layers are connected in a one-to-one correspondence manner, and each feature processing sub-network is used for processing and outputting the input features;
the prediction module is configured to obtain a prediction result of each training sample output by each task sub-network for a corresponding task based on a second feature output by the expert layer and the task sub-network corresponding to each task in the initial multi-task model;
and the parameter adjusting module is configured to adjust parameters of the initial multi-task model based on a prediction result of each training sample output by each task sub-network for a corresponding task and a label of each task corresponding to each training sample in the training sample data set.
In some embodiments, the second feature output by the expert layer comprises: the sub-features output by each feature processing sub-network included in the feature processing layer of the highest layer in the expert layers;
the prediction module is further configured to, for each task sub-network in the initial multi-task model, at least utilize binary coding variables to screen sub-features output by each feature processing sub-network included in the feature processing layer of the highest layer, so as to obtain third features corresponding to the task sub-networks; and utilizing each task sub-network to respectively learn the third characteristics corresponding to the task sub-networks to obtain the prediction result of each training sample output by each task sub-network for the corresponding task.
In some embodiments, the prediction module is further configured to perform screening and weighting processing on the sub-features output by each feature processing sub-network included in the feature processing layer of the highest layer by using the binary coding variables and the weight matrix, so as to obtain a third feature corresponding to the task sub-network.
In some embodiments, the expert layer includes a first feature processing layer and a second feature processing layer, and the feature processing module is further configured to perform weighting processing and nonlinear mapping on the first feature by using each feature processing sub-network included in the first feature processing layer, so as to obtain an intermediate feature output by each feature processing sub-network in the first feature processing layer; and performing weighting processing and nonlinear mapping on the intermediate features output by the corresponding feature processing sub-networks in the first feature processing layer by using each feature processing sub-network included in the second feature processing layer to obtain the second features output by the expert layer.
In some embodiments, the parameter tuning module is further configured to calculate, for each of the tasks, a loss value for the task based on a prediction result for the task for each training sample output by a task sub-network of the task and a label for the task for each training sample in the set of training sample data; determining a total loss of the initial multi-tasking model based on the loss value of each task; adjusting parameters of the initial multitasking model based on the total loss.
According to a fourth aspect of the embodiments of the present disclosure, there is provided an information recommendation apparatus including:
the second acquisition module is configured to acquire the user information of the target user and the media information to be recommended;
the processing module is configured to process the user information of the target user and each piece of media information by using a multitask model to obtain the probability of each behavior of the target user on each piece of media information; wherein the multitask model is trained based on the method of the first aspect;
the determining module is configured to determine the interest degree of the target user in each piece of media information based on the probability of each action of the target user on each piece of media information;
and the recommending module is configured to rank the interest degree of each piece of media information and determine the target media information recommended to the target user from each piece of media information according to a ranking result.
According to a fifth aspect of embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform a training method of a multitask model as described in the first aspect above, or an information recommendation method as described in the second aspect.
According to a sixth aspect of embodiments of the present disclosure, there is provided a storage medium comprising:
the instructions in said storage medium, when executed by a processor of an electronic device, enable the electronic device to perform a method of training a multitask model as described in the first aspect above, or a method of information recommendation as described in the second aspect.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
in the embodiment of the disclosure, inside the expert layer, each feature processing sub-network in the bottommost feature processing layer (the one connected to the initial feature extraction layer) learns independently, and the feature processing sub-networks in adjacent feature processing layers are connected in one-to-one correspondence. On one hand, this strengthens each expert's independent learning and expression capability, so the output of each feature processing sub-network in a higher feature processing layer can carry part of the important feature information, making it convenient to give different tasks (for example, the task sub-networks corresponding to the tasks) different selection emphases. On the other hand, no parameters are shared inside the expert layer, which reduces the parameter quantity and makes model training and convergence faster.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is an exemplary diagram of a Hard Shared model structure.
FIG. 2 is an exemplary diagram of an MMOE model structure.
FIG. 3 is an exemplary diagram of an SNR-Trans model structure.
Fig. 4 is a flowchart of a training method of a multitask model according to an embodiment of the present disclosure.
Fig. 5 is a diagram illustrating a structure of a multitasking model according to an embodiment of the disclosure.
Fig. 6 is a flowchart of an information recommendation method according to an embodiment of the disclosure.
FIG. 7 is a diagram illustrating a training apparatus for a multitask model according to an exemplary embodiment.
Fig. 8 is a diagram illustrating an information recommendation device according to an example embodiment.
FIG. 9 is a block diagram illustrating a server in accordance with an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
In the related art, multi-task modeling has iterated toward structures in which multiple tasks share some parameters while keeping independent parameters of their own. This alleviates the phenomenon that improving the performance of some tasks comes at the cost of degrading others, also called the seesaw effect between tasks. In model structure design, the parameter sharing mechanism is the core point for improving the effect of a multi-task model.
The earliest approach is hard parameter sharing (Hard Shared). FIG. 1 is an exemplary diagram of a Hard Shared model structure; as shown in FIG. 1, it includes an Input layer, a Shared Bottom layer, and a Tower layer, where the Tower layer includes Tower A and Tower B corresponding to two tasks, and different tasks train different towers. The input layer is a feature matrix formed from the features of the sample (for example, a two-dimensional matrix formed by embedding the sample's features); the shared bottom layer may be a fully connected layer used to further process the input layer's features; and each task's own tower performs the feature processing associated with that task, so as to output features suited to the task.
In the hard parameter sharing mode, the network parameters of the input layer and the shared bottom layer are shared, and an independent tower is split off per task to obtain each task's output. Introducing a shared bottom layer reduces the number of parameters each task must train independently, which helps prevent overfitting and speeds up model training. However, this model suits problems without conflict between tasks, i.e., where the tasks are similar; if the tasks are dissimilar, the shared layer can make the prediction results worse.
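The hard parameter sharing structure can be sketched in a few lines; the concrete layer computations below are toy assumptions, only the sharing pattern matters:

```python
def shared_bottom(x):                  # parameters shared by every task
    return [2.0 * xi for xi in x]      # toy stand-in for a fully connected layer

def tower_a(h):                        # task A's independent tower
    return sum(h)

def tower_b(h):                        # task B's independent tower
    return max(h)

def hard_shared_forward(x):
    h = shared_bottom(x)               # one shared representation for all tasks
    return tower_a(h), tower_b(h)      # only the towers are task-specific
```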
Later, the Multi-gate Mixture-of-Experts (MMOE) model structure was iterated. FIG. 2 is an exemplary diagram of an MMOE model structure; as shown in FIG. 2, it includes an Input layer, an expert layer (including expert 0, expert 1, and expert 2), a gate layer (including gate A and gate B), and a tower layer (including tower A and tower B, corresponding to two tasks). Compared with the Hard Shared model, the main change in MMOE is the introduction of experts and a gating mechanism: one expert is equivalent to a multilayer neural network and outputs one feature vector (as shown in FIG. 2, 3 experts yield 3 feature vectors in total); a gate may be a shallow neural network that assigns each expert a weight parameter (the weights one gate assigns sum to 1), and the weighted combination of the experts' features is input into the tower layer.
The advantage of the multi-gate mixture-of-experts model is that the expert mechanism extracts more information, and the gating mechanism gives each task different weights over each expert, achieving a more flexible parameter sharing mode.
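The per-task gating described above can be sketched as follows; softmax-normalized gate weights are the standard MMOE choice, and the rest of the names are illustrative:

```python
import math

def softmax(logits):
    # normalize so the weights one gate assigns sum to 1
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def mmoe_task_input(expert_outputs, gate_logits):
    # the task's gate weights each expert, then the expert feature vectors
    # are combined by that weighting before entering the task's tower
    g = softmax(gate_logits)
    dim = len(expert_outputs[0])
    return [sum(g[k] * expert_outputs[k][i] for k in range(len(g)))
            for i in range(dim)]
```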
To further improve the flexibility of parameter sharing, the Sub-Network Routing (SNR) model was then iterated on the basis of the foregoing models. The SNR model uses binary coding variables to control the connections between sub-networks and includes two connection modes: SNR-Trans and SNR-Aver. FIG. 3 is an exemplary diagram of an SNR-Trans model structure; as shown in FIG. 3, it includes an Input layer, an expert layer (6 experts: expert 0 through expert 5), and a tower layer (tower A and tower B, corresponding to two tasks). The input layer and the expert layer are connected by an ordinary fully connected network. The expert layer has two layers, and a coding-layer idea is introduced inside it: the features extracted by each lower-layer expert (expert 0 to expert 2) carry a 0-or-1 parameter controlling whether they take effect for an upper-layer expert (expert 3 to expert 5). This parameter can be updated through gradient backpropagation, so the upper-layer experts have a larger selection space of connections, while parameters inside the expert layer are shared. Where the expert layer connects to the tower layer, weight parameters (also updated through gradient backpropagation) combine the experts, and the weighted result of the experts' features is input to the task's corresponding tower.
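The binary-coded connection control inside the SNR expert layer can be sketched as follows for a single upper-layer expert; scalar features and the aggregation rule are simplifying assumptions (SNR-Trans actually applies learnable transformation matrices):

```python
def snr_upper_expert(lower_features, codes, weights):
    # upper-layer expert: each lower-layer expert's feature is gated by a
    # learned 0/1 coding variable and scaled by a weight before aggregation
    return sum(c * w * f for c, w, f in zip(codes, weights, lower_features))
```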
The advantage of the SNR model is that parameter sharing between tasks is more flexible, which relieves the seesaw effect between independent tasks to some extent. However, compared with the models shown in FIG. 1 or FIG. 2, its structure is more complex: on one hand, the coding-layer idea introduces more parameters to learn, so the model converges slowly; on the other hand, parameters are shared inside the expert layer, so the experts' expression capability is insufficient.
To this end, an embodiment of the present disclosure provides a method for training a multitask model, and fig. 4 is a flowchart of the method for training the multitask model provided by the embodiment of the present disclosure, as shown in fig. 4, the method for training the multitask model includes the following steps:
s11, acquiring a training sample data set; the training sample data set comprises a plurality of training samples relevant to each task and labels of each task corresponding to each training sample;
s12, performing iterative training on the initial multi-task model by using the training sample data set until a convergence condition is met to obtain a trained multi-task model, wherein the process of each iterative training comprises the following steps:
s12a, performing feature extraction on each training sample in the training sample data set by using an initial feature extraction layer in an initial multi-task model to obtain a first feature output by the initial feature extraction layer;
s12b, performing feature processing on the first features by utilizing an expert layer in the initial multitask model to obtain second features output by the expert layer; the expert layer comprises a plurality of feature processing layers, one feature processing layer comprises a plurality of feature processing sub-networks, the feature processing sub-networks in the adjacent feature processing layers are connected in a one-to-one correspondence manner, and each feature processing sub-network is used for processing and outputting the input features;
s12c, obtaining a prediction result of each training sample output by each task sub-network aiming at the corresponding task based on the second characteristics output by the expert layer and the task sub-network corresponding to each task in the initial multi-task model;
and S12d, adjusting parameters of the initial multi-task model based on the prediction result of each training sample output by each task sub-network aiming at the corresponding task and the label of each task corresponding to each training sample in the training sample data set.
It should be noted that the method for training a multitask model according to the embodiment of the present disclosure may be applied to a device for training a multitask model according to the embodiment of the present disclosure, and the device may be configured in an electronic device, for example, the electronic device is a server device or the like.
In step S11, the training sample data set acquired by the electronic device not only includes a plurality of training samples, but also includes labels of each task corresponding to each training sample.
Taking the application of the multi-task model of the embodiment of the present disclosure to an obstacle avoidance scene during vehicle driving as an example, the multiple tasks may include: obstacle center point localization, obstacle corner point localization, obstacle bounding box prediction, and the like. The training samples comprise image samples within the vehicle's driving field of view, and the labels of each task corresponding to each training sample comprise an obstacle center point localization label, an obstacle corner point localization label, an obstacle bounding box prediction label, and the like.
Taking the application of the multi-task model of the embodiment of the disclosure to a recommendation system as an example: because explicit feedback is scarce in recommendation systems and most user feedback is not a direct score, recommendation systems mostly recommend based on implicit feedback, for example, on user behaviors such as following, sharing, praise (likes), watch duration, and complete playback. For this, a multi-task model can predict a score for each task (e.g., following, praise, or comments), and the scores are finally fused in a weighted manner into a final score, thereby completing the recommendation. Correspondingly, taking information stream recommendation as an example, the training samples may include media sample data and user data, where the media sample data may include short videos, pictures, or text streams the user has viewed. The labels of the tasks corresponding to the training samples may be praise, share, and complete playback.
For example, for a short video sample, if the user praises, shares, and watches the video for at least the predetermined duration, the label of the training sample may be represented as (1, 1, 1); for another video sample, if the user only praises but does not share it, and the watch duration does not reach the predetermined duration, the label may be represented as (1, 0, 0).
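The label construction above can be sketched directly; the 30-second threshold and the argument names are hypothetical, chosen only to make the example concrete:

```python
def make_label(praised, shared, watch_seconds, full_watch_threshold=30):
    # one 0/1 label per implicit-feedback task: praise, share, complete playback
    return (int(praised), int(shared), int(watch_seconds >= full_watch_threshold))
```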
Based on the training samples and the labels, when the electronic equipment trains the multitask model, parameters of the multitask model can be adjusted according to the prediction result and the label calculation loss value. In step S12, the electronic device performs iterative training on the initial multi-task model by using the training sample data set until a convergence condition is met to obtain a trained multi-task model, where the process of each iterative training includes:
in step S12a, the electronic device performs feature extraction on each training sample by using an initial feature extraction layer in the initial multitask model, so as to obtain a first feature output by the initial feature extraction layer. In the embodiment of the present disclosure, the initial feature extraction layer may be the same as the aforementioned input layer, and the features in the input layer are expressions of feature information of the current training batch sample.
Taking an obstacle avoidance scene in the vehicle driving process as an example, the first features output by the initial feature extraction layer include some features capable of expressing the features of the obstacle, such as shape features and texture features of the obstacle, and may also include some features capable of expressing the driving environment, such as weather features and illumination intensity features.
Taking the recommendation of the information stream as an example, the first feature output by the initial feature extraction layer includes some features that can actually express the content, features, and the like of the media information, for example. For example, the classification, topic, content of the media information, income brought by the media information, etc. of the media information may also include some statistical features like click rate, approval rate, etc. In addition, the first feature output by the initial feature extraction layer may include, in addition to the feature of the media information, a feature of the user, for example, a basic attribute feature such as an age, a gender, and an occupation of the user, a preference feature, an account rating feature, and the like of the user.
It should be noted that the first feature output by the initial feature extraction layer is a relatively rough feature, the expression capability of the feature on the sample is not strong enough, and further feature optimization processing is required.
In the embodiment of the present disclosure, the initial multitasking model further includes an expert layer, and the expert layer is used for performing feature optimization processing. In step S12b, the electronic device performs feature processing on the first feature by using the expert layer to obtain a second feature output by the expert layer. The expert layer includes a plurality of feature processing layers, each of which includes a plurality of feature processing sub-networks, for example, the expert layer includes two or three feature processing layers, each of which includes two or three feature processing sub-networks, and the like. It should be noted that the greater the number of feature processing layers and the number of feature processing subnetworks, the better the feature optimization process may be, but at the same time, the speed of model convergence may also be affected.
In the embodiment of the disclosure, among the different feature processing layers of the expert layer, the feature processing sub-networks in adjacent feature processing layers are connected in a one-to-one correspondence manner, that is, each feature processing sub-network in one layer is connected to exactly one feature processing sub-network in the adjacent layer. Each feature processing sub-network is configured to process its input features and output the processed features. It can be understood that the second feature output by the expert layer may include the sub-features output by each feature processing sub-network included in the feature processing layer at the highest layer of the expert layer.
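The one-to-one wiring between adjacent feature processing layers can be sketched as follows. This is a minimal numpy illustration, assuming each expert is a single weight-plus-ReLU transform; the dimensions and the single-layer form of each expert are illustrative assumptions, not prescribed by the method.

```python
import numpy as np

def make_expert(in_dim, out_dim, rng):
    # One feature processing sub-network ("expert"): a weight matrix and bias
    # followed by a ReLU nonlinearity (an assumed choice of nonlinearity).
    W = rng.standard_normal((in_dim, out_dim)) * 0.1
    b = np.zeros(out_dim)
    return lambda x: np.maximum(0.0, x @ W + b)

rng = np.random.default_rng(0)
lower_layer = [make_expert(8, 4, rng) for _ in range(3)]   # e.g., e1, e2, e3
upper_layer = [make_expert(4, 4, rng) for _ in range(3)]   # e.g., ee1, ee2, ee3

x = rng.standard_normal((2, 8))   # first features for a batch of 2 samples
# One-to-one correspondence: expert i in the lower layer feeds only expert i above it.
sub_features = [upper(lower(x)) for lower, upper in zip(lower_layer, upper_layer)]
# The second feature output by the expert layer is the set of top-layer sub-features.
```

Because no feature is routed across expert chains, each chain learns independently, which is what keeps the parameter count down.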
In the embodiment of the disclosure, the initial multitask model further includes task sub-networks corresponding to the tasks, and since each task in the multitask has different sensitivity to features, a task sub-network is separately set for each task to learn the characteristics of the task, thereby alleviating the occurrence of conflicts among the tasks. The task sub-network also belongs to the network used for feature optimization.
In step S12c, the electronic device obtains the prediction result of each training sample output by each task sub-network for the corresponding task, based on the task sub-network corresponding to each task and the second feature output by the expert layer. Then, in step S12d, the electronic device adjusts the parameters of the initial multi-task model based on the prediction result of each training sample output by each task sub-network for the corresponding task and the label of each task corresponding to each training sample.
For example, based on the prediction result output by each task sub-network and the label of each task, parameters including weights, biases and the like are optimized by using a back propagation algorithm together with gradient descent or stochastic gradient descent, and a multi-task model with high accuracy can be obtained through one or more rounds of parameter optimization and adjustment.
Fig. 5 is a diagram illustrating a structure of a multitask model according to an embodiment of the present disclosure. As shown in fig. 5, the multitask model includes an Input layer, an expert layer (including 6 experts in the diagram: e1, e2, e3, ee1, ee2, and ee3), and a tower layer (including tower 1, tower 2, and tower 3, corresponding to three tasks respectively). The input layer and the expert layer can be connected by a common fully connected network; the expert layer comprises two layers, where e1 is connected with ee1, e2 is connected with ee2, and e3 is connected with ee3, and the features extracted by each expert in the lower layer are transmitted to an expert in the upper layer in a one-way manner; above that is the tower layer, where tower 1, tower 2 and tower 3 each correspond to a task sub-network. In the embodiment of the present disclosure, each task sub-network may internally consist of simple fully connected layers, the number and width of which are determined by the modeler according to business characteristics; for example, 3 fully connected layers may be adopted, and feature fusion may be performed in combination with a Feature Pyramid Network (FPN).
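A tower (task sub-network) built from the 3 fully connected layers mentioned above might look like the following sketch; the layer widths, the ReLU/sigmoid choices, and the omission of FPN-style fusion are assumptions made for illustration.

```python
import numpy as np

def make_tower(dims, rng):
    # A task sub-network from fully connected layers; the modeler chooses the
    # number and width of layers, so dims here is purely illustrative.
    layers = [(rng.standard_normal((i, o)) * 0.1, np.zeros(o))
              for i, o in zip(dims[:-1], dims[1:])]
    def forward(x):
        for k, (W, b) in enumerate(layers):
            x = x @ W + b
            if k < len(layers) - 1:
                x = np.maximum(0.0, x)       # ReLU on hidden layers
        return 1.0 / (1.0 + np.exp(-x))      # sigmoid -> per-task probability
    return forward

rng = np.random.default_rng(1)
towers = [make_tower([4, 8, 8, 1], rng) for _ in range(3)]  # tower 1, 2, 3
v = rng.standard_normal((5, 4))      # a batch of third features for one task
preds = towers[0](v)                 # predictions of tower 1, shape (5, 1)
```

Each tower is trained only against the label of its own task, which is how task-specific characteristics are learned.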
It can be understood that, in the embodiment of the present disclosure, in the lowest feature processing layer inside the expert layer (the one connected to the initial feature extraction layer), each feature processing sub-network learns independently, and the feature processing sub-networks in adjacent feature processing layers are connected in a one-to-one correspondence. On the one hand, this enhances the independent learning and expression capability of each expert, so that the output of each feature processing sub-network in the higher feature processing layer can carry part of the important feature information, which makes it convenient to provide different selection emphases for different tasks (e.g., the task sub-networks corresponding to the tasks); on the other hand, no sharing takes place within the expert layer, so the parameter quantity is reduced and the model training and convergence speed is higher.
In some embodiments, the second feature output by the expert layer comprises: the sub-features output by each feature processing sub-network included in the feature processing layer of the highest layer in the expert layers;
the obtaining a prediction result of each training sample output by each task sub-network for a corresponding task based on the second feature output by the expert layer and the task sub-network corresponding to each task in the initial multi-task model includes:
for each task sub-network in the initial multi-task model, screening the sub-characteristics output by each feature processing sub-network included in the feature processing layer of the highest layer by using at least binary coding variables to obtain third characteristics corresponding to the task sub-networks;
and utilizing each task sub-network to respectively learn the third characteristics corresponding to the task sub-networks to obtain the prediction result of each training sample output by each task sub-network for the corresponding task.
In this disclosure, the second feature output by the expert layer may include the sub-feature output by each feature processing sub-network included in the feature processing layer located at the highest layer, and each task sub-network in the initial multi-task model may use binary coding variables to filter the sub-features output by the expert layer. A binary coding variable is a variable taking the value 0 or 1, used to control whether a lower-layer expert takes effect for the layer above it, similar to a gating parameter based on the idea of an encoding layer. The task sub-network screens each sub-feature at least by using the binary coding variables, so that the third feature corresponding to the task sub-network can be obtained.
The third feature may be obtained by, after the task sub-network screens each sub-feature based on the binary coding variables, transforming the screened sub-features with a transformation matrix obtained through the matrix operations of the neural network and then connecting them; or by, after the screening, connecting the screened sub-features through a weighted average based on an identity matrix. The embodiment of the present disclosure is not limited in this respect.
In the embodiment of the present disclosure, the task subnetworks are used to learn the third features corresponding to the task subnetworks, so as to obtain the prediction result output by each task subnetwork.
It can be understood that, in the embodiment of the present disclosure, in the initial multitask model, the task sub-network corresponding to each task selects each sub-feature output by the expert layer based on the binary coding variables, and compared with a method of connecting features output by the expert layer based on weights, the method of the embodiment of the present disclosure makes the feature selection of the task sub-network more flexible, and different features can be selected in a targeted manner between tasks, so that a seesaw phenomenon between tasks can be effectively alleviated.
In some embodiments, the screening, by using at least a binary coding variable, the sub-features output by each feature processing sub-network included in the feature processing layer of the highest layer to obtain a third feature corresponding to the task sub-network includes:
and screening and weighting the sub-characteristics output by each feature processing sub-network included in the feature processing layer of the highest layer by using the binary coding variables and the weight matrix to obtain a third characteristic corresponding to the task sub-network.
In the embodiment of the present disclosure, the weight matrix is the transformation matrix obtained through the matrix operations of the neural network. When the electronic device screens the sub-features output by each feature processing sub-network included in the feature processing layer at the highest layer by using at least the binary coding variables to obtain the third feature corresponding to the task sub-network, it also performs feature optimization processing in combination with the weight matrix.
The following formula (1) is a formula of the feature connection between the tower layer (including a plurality of task sub-networks) and the expert layer in the embodiment of the present disclosure:
v_j = Σ_i Z_ij * W_ij * u_i,  i = 1, 2, 3;  j = 1, 2, 3  (1)
wherein u1, u2 and u3 represent the sub-features output by the three feature processing sub-networks included in the feature processing layer at the highest layer of the expert layer; W_ij represents the weight matrix connecting the ith feature processing sub-network of that layer to the jth task sub-network of the tower layer; Z_ij represents a binary coding variable, whose value of 0 or 1 controls whether the feature processing sub-network is connected with the task sub-network; v1, v2 and v3 represent the transformed results, i.e., the third features corresponding to the respective task sub-networks.
It should be noted that, in the embodiments of the present disclosure, Z_ij and W_ij are parameters that can be updated based on gradient backpropagation, and the training of the multitask model includes the training of these parameters.
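The connection in formula (1) can be sketched in numpy as follows; the feature dimension and the particular 0/1 pattern of Z are illustrative assumptions (in training, Z_ij and W_ij would be learned rather than fixed).

```python
import numpy as np

rng = np.random.default_rng(2)
u = [rng.standard_normal(4) for _ in range(3)]   # u1, u2, u3 from the top expert layer
Z = np.array([[1, 0, 1],                         # Z[i, j]: does expert i feed tower j?
              [0, 1, 1],
              [1, 1, 0]])
W = rng.standard_normal((3, 3, 4, 4)) * 0.1      # W[i, j]: weight matrix per connection

def third_feature(j):
    # v_j = sum_i Z_ij * (u_i @ W_ij): screen with the binary codes, then weight.
    return sum(Z[i, j] * (u[i] @ W[i, j]) for i in range(3))

v = [third_feature(j) for j in range(3)]         # third features for the three towers
```

Setting Z[i, j] to 0 removes expert i's contribution to tower j entirely, which is how different tasks can select different features.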
It can be understood that, in the embodiment of the present disclosure, the electronic device performs screening and weighting processing on each sub-feature output by the expert layer by using the binary coding variables and the weight matrix, so that the third feature obtained by the task sub-network is better, and thus, convergence of the multitask model can be accelerated, and the processing accuracy of the trained multitask model is higher.
In some embodiments, the expert layer includes a first feature processing layer and a second feature processing layer, and the performing feature processing on the first feature by using the expert layer in the initial multitasking model to obtain a second feature output by the expert layer includes:
respectively performing weighting processing and nonlinear mapping on the first features by using each feature processing sub-network included in the first feature processing layer, so as to obtain intermediate features output by each feature processing sub-network in the first feature processing layer;
and performing weighting processing and nonlinear mapping on the intermediate features output by the corresponding feature processing sub-networks in the first feature processing layer by using each feature processing sub-network included in the second feature processing layer to obtain the second features output by the expert layer.
In the embodiment of the present disclosure, the expert layer includes two feature processing layers, namely, a first feature processing layer and a second feature processing layer. Each feature processing sub-network included in each feature processing layer independently performs weighting processing and nonlinear mapping on the input features to obtain better feature output.
For example, as shown in fig. 5, e1, e2 and e3 respectively perform weight transformation and nonlinear mapping on the features output by the input layer (i.e., the first features), further extracting information to obtain intermediate features; then ee1 performs weight transformation and nonlinear mapping on the intermediate feature output by e1 to obtain the optimized sub-feature corresponding to ee1. Similarly, the sub-features corresponding to ee2 and ee3 can be obtained by performing the same operations on the intermediate features output by e2 and e3, and the second feature output by the expert layer includes the sub-features corresponding to ee1, ee2 and ee3 respectively.
It can be understood that, in the embodiment of the present disclosure, each feature processing sub-network in the expert layer performs weighting processing and nonlinear mapping independently, so that the feature expression capability of each feature processing sub-network is stronger, the convergence speed of the model can be increased, and the processing accuracy of the trained multi-task model is higher.
In some embodiments, the adjusting parameters of the initial multitask model based on a prediction result of each training sample output by each task subnetwork for a corresponding task and a label of each task corresponding to each training sample in the training sample data set includes:
for each task, calculating a loss value of the task based on a prediction result of each training sample output by a task sub-network of the task for the task and a label of each training sample in the training sample data set for the task;
determining a total loss of the initial multi-tasking model based on the loss value of each task;
adjusting parameters of the initial multitasking model based on the total loss.
In the embodiment of the disclosure, the loss value of each type of task is calculated according to the difference between the prediction result of the training sample for each type of task and the label. For example, the loss value corresponding to each type of task may be calculated based on a logistic regression or a mean square error method, and a specific implementation manner of calculating the loss value of each type of task is not limited in the embodiment of the present disclosure.
It should be noted that the loss value of each type of task is based on all training samples in the training sample data set; the loss value of a single task may be represented as the average of the differences between the prediction results and the label values over all training samples in the training sample data set, and the smaller the loss value, the smaller the difference. Correspondingly, the smaller the total loss determined based on the loss value of each type of task, the better the convergence degree, i.e., the accuracy, of the model. Therefore, the embodiment of the disclosure can adjust the parameters of the initial multi-task model based on the total loss to obtain a multi-task model with better accuracy than the initial multi-task model.
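The per-task loss as an average difference over the training samples can be sketched as follows, assuming mean squared error as the difference measure (a logistic-regression loss would be handled analogously):

```python
def task_loss(preds, labels):
    # Loss of one task: average of the squared differences between the
    # prediction results and the label values over all training samples.
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(preds)

# Perfect predictions give a loss of 0; otherwise the loss is the mean error.
```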
In the embodiment of the present disclosure, when determining the total loss based on the loss value of each type of task, the total loss may be determined based on a predetermined loss function, with the loss value of each type of task as an independent variable and the total loss as a dependent variable. For example, the sum of the loss values of various types of tasks may be taken as the total loss; different weights can be allocated to each type of task and then summed, and the sum value is used as the total loss.
For example, the loss function for calculating the total loss can be given by the following equation (2):
loss=0.5*loss1+0.3*loss2+0.2*loss3 (2)
where loss on the left side of the equation is the total loss; loss1 on the right side may be the loss corresponding to task 1 in fig. 5, and 0.5 is the weight corresponding to task 1; loss2 may be the loss corresponding to task 2 in fig. 5, and 0.3 is the weight corresponding to task 2; loss3 may be the loss corresponding to task 3 in fig. 5, and 0.2 is the weight corresponding to task 3.
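Equation (2) as code; the weights 0.5/0.3/0.2 are the illustrative values from the example above, not fixed by the method:

```python
def total_loss(task_losses, weights=(0.5, 0.3, 0.2)):
    # Total loss as a weighted sum of the per-task losses, as in equation (2);
    # the default weights are the example's illustrative values.
    return sum(w * l for w, l in zip(weights, task_losses))

# With per-task losses 0.2, 0.4 and 0.6:
# total = 0.5*0.2 + 0.3*0.4 + 0.2*0.6 = 0.34
```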
It can be understood that, in the embodiment of the present disclosure, the total loss is determined based on the loss of each type of task, and the parameters in the initial multi-task model are adjusted based on the total loss rather than based on a single task, so that the multi-task model can balance the processing of different tasks; by training the multiple tasks as a whole, the trained multi-task model can obtain more comprehensive and accurate multi-task recognition results.
In the embodiment of the disclosure, the trained multi-task model can be applied to the obstacle avoidance scenario during vehicle driving, to a recommendation system, and the like, although the present disclosure is not limited to these two scenarios. Taking the obstacle avoidance scenario during vehicle driving as an example, based on the trained multi-task model, for an environment image currently captured while the vehicle is driving, the vehicle is controlled to react in time according to the positioning prediction result for the center point of the obstacle, the positioning prediction result for the corner points of the obstacle, the prediction result for the bounding box of the obstacle (e.g., a circumscribed rectangular frame), and the prediction results for features of the obstacle and the current driving environment (obstacle type, obstacle size, and the like), so that the possibility of accidents is reduced.
Taking an example of applying a trained multitask model to information recommendation, fig. 6 is a flowchart of an information recommendation method provided by the embodiment of the present disclosure, and as shown in fig. 6, the information recommendation method includes the following steps:
s21, obtaining user information of a target user and media information to be recommended;
s22, processing the user information of the target user and each piece of media information by using a multitask model to obtain the probability of each behavior of the target user on each piece of media information; the multitask model is obtained by training based on the method;
s23, determining the interest degree of the target user in each piece of media information based on the probability of each action of the target user on each piece of media information;
s24, ranking the interest degree of each piece of media information, and determining target media information recommended to the target user from each piece of media information according to a ranking result.
In the embodiment of the present disclosure, the information recommendation method may be applied to the information recommendation apparatus in the embodiment of the present disclosure, and the apparatus may be configured in an electronic device, for example a server device, which may be the server device that performs the multitask model training or another server device. If it is a server device other than the one that trains the multitask model, the trained multitask model needs to be deployed on that server device in advance.
In step S21, the electronic device may obtain the user information of a target user and the media information to be recommended. The target user may refer to a user currently logged in to a certain application, and the user information of the target user may include account information of the target user (including an account name, an account level, and the like); identity attribute information of the target user such as age, gender and occupation; and may also include preference information of the user and the like. The media information to be recommended is, for example, videos, audio, pictures or text content, or media information combining multiple forms, for example videos with added text descriptions.
In step S22, the electronic device inputs the user information of the target user and each piece of media information into the multitask model, and obtains the probability of each behavior of the target user on each piece of media information, for example, the probability that the target user approves of a certain video, shares it, or plays it completely; each behavior corresponds to one of the tasks mentioned in the embodiment of the present disclosure.
In the embodiment of the present disclosure, the multitask model may be a model covering the above behaviors, which adopts the model structure shown in fig. 5 and is trained with a method whose core idea is the steps shown in fig. 4, using training samples that include user data and media sample data, where each piece of media sample data is labeled with the behavior tags corresponding to the user's historical behaviors.
It should be noted that, in some embodiments, the user data in the training sample may be data covering all historical users, and the corresponding media sample data may be the media sample data that the historical users have viewed. During model training, the electronic device performing model training may extract the identity attribute information of all historical users and, for example, learn the features (topics of interest, presentation forms, and the like) of the media information that users with given identity attribute features are interested in; when the model is applied, the electronic device for information recommendation can determine the identity of the target user according to the user information of the current target user and recommend according to the learned features of the media information that users with that identity are interested in.
In other embodiments, the user data in the training sample may also be only the user data of the target user, and the corresponding media sample data may be the media sample data that the target user has viewed. When the model is trained, the electronic device for model training can learn, from the media sample data of the target user, characteristics such as the media content and presentation forms that the target user is interested in; when the model is applied, the electronic device for information recommendation can find the multitask model corresponding to the target user according to the user information of the current target user and recommend according to the learned features of the media information that the target user is interested in.
In step S23, after obtaining the probability of each behavior of the target user on each piece of media information, the electronic device may determine the degree of interest of the target user in each piece of media information based on the probabilities of the behaviors corresponding to that media information.
It should be noted that, in some embodiments, the electronic device may directly determine the sum of the probabilities corresponding to the media information as the degree of interest of the target user in the media information, and the greater the sum of the probabilities, the greater the degree of interest.
For example, if the probability of the target user agreeing with the media information a is 0.3, the probability of sharing is 0.5, and the probability of completely playing is 0.8, the degree of interest of the target user in the media information is 1.6.
In other embodiments, the electronic device may directly perform weighted summation on the probabilities corresponding to the media information based on the weights corresponding to the tasks, and determine a weighted sum as a degree of interest of the target user in the media information, where the greater the weighted sum, the greater the degree of interest.
Illustratively, sharing may bring more benefits, and full play may indicate that the user is more interested in the media information to some extent, so the praise weight may be set to 0.1, the share weight to 0.4, and the full-play weight to 0.5; correspondingly, the degree of interest of the target user in the media information is 0.1×0.3 + 0.4×0.5 + 0.5×0.8 = 0.63.
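The two ways of computing the degree of interest described above (plain sum of probabilities, or weighted sum with per-task weights) can be sketched as:

```python
def interest_degree(probs, weights=None):
    # probs: probabilities of each behavior (e.g., praise, share, full play).
    # With no weights the degree of interest is the plain sum of probabilities;
    # otherwise it is the weighted sum using the per-task weights.
    if weights is None:
        return sum(probs)
    return sum(w * p for w, p in zip(weights, probs))

# Reproduces the examples above: the plain sum gives 1.6, the weighted sum 0.63.
```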
In step S24, the electronic device may rank the interest levels of the media information, and determine the target media information recommended to the target user from the media information according to the ranking result, for example, recommending a preset number of media information in the order from high to low of the interest levels.
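The ranking and selection in step S24 can be sketched as follows; the parameter `k`, standing in for the preset number of recommendations, is an assumed name.

```python
def recommend(media, degrees, k):
    # Rank media information by degree of interest, highest first, and take
    # the top k as the target media information recommended to the user.
    ranked = sorted(zip(media, degrees), key=lambda pair: pair[1], reverse=True)
    return [m for m, _ in ranked[:k]]

# recommend(["a", "b", "c"], [0.2, 0.9, 0.5], 2) returns ["b", "c"]
```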
It can be understood that, in the embodiment of the present disclosure, the electronic device for information recommendation uses the embodiment of the present disclosure to propose the multitask model to perform information recommendation, and since each feature processing sub-network in the expert layer of the multitask model learns independently and the feature processing sub-networks in each adjacent feature processing layer are connected in a one-to-one correspondence manner, the information recommendation performed based on the multitask model can be improved in recommendation accuracy and recommendation speed.
FIG. 7 is a diagram illustrating a training apparatus for a multitask model according to an exemplary embodiment. Referring to fig. 7, the apparatus includes:
a first obtaining module 101 configured to obtain a training sample data set; the training sample data set comprises a plurality of training samples relevant to each task and labels of each task corresponding to each training sample;
the training module 102 is configured to perform iterative training on the initial multitask model by using the training sample data set until a convergence condition is met to obtain a trained multitask model, and the process of each iterative training includes:
a feature extraction module 102a, configured to perform feature extraction on each training sample in the training sample data set by using an initial feature extraction layer in the initial multitask model to obtain a first feature output by the initial feature extraction layer;
the feature processing module 102b is configured to perform feature processing on the first features by using an expert layer in the initial multitask model to obtain second features output by the expert layer; the expert layer comprises a plurality of feature processing layers, one feature processing layer comprises a plurality of feature processing sub-networks, the feature processing sub-networks in the adjacent feature processing layers are connected in a one-to-one correspondence manner, and each feature processing sub-network is used for processing and outputting input features;
the prediction module 102c is configured to obtain a prediction result of each training sample output by each task sub-network for a corresponding task based on the second feature output by the expert layer and the task sub-network corresponding to each task in the initial multi-task model;
the parameter adjusting module 102d is configured to adjust parameters of the initial multi-task model based on a prediction result of each training sample output by each task sub-network for a corresponding task and a label of each task corresponding to each training sample in the training sample data set.
In some embodiments, the second feature output by the expert layer comprises: the sub-features output by each feature processing sub-network included in the feature processing layer of the highest layer in the expert layers;
the prediction module 102c is further configured to, for each task sub-network in the initial multi-task model, at least utilize a binary coding variable to filter sub-features output by each feature processing sub-network included in the feature processing layer of the highest layer, so as to obtain third features corresponding to the task sub-networks; and utilizing each task sub-network to respectively learn the third characteristics corresponding to the task sub-networks, so as to obtain the prediction result of each training sample output by each task sub-network for the corresponding task.
In some embodiments, the predicting module 102c is further configured to perform screening and weighting processing on the sub-features output by each feature processing sub-network included in the feature processing layer of the highest layer by using the binary coding variables and the weight matrix, so as to obtain a third feature corresponding to the task sub-network.
In some embodiments, the expert layer includes a first feature processing layer and a second feature processing layer, and the feature processing module 102b is further configured to perform weighting processing and nonlinear mapping on the first features by using each feature processing sub-network included in the first feature processing layer, respectively, to obtain an intermediate feature output by each feature processing sub-network in the first feature processing layer; and perform weighting processing and nonlinear mapping on the intermediate features output by the corresponding feature processing sub-networks in the first feature processing layer by using each feature processing sub-network included in the second feature processing layer, to obtain the second features output by the expert layer.
In some embodiments, the parameter adjusting module 102d is further configured to calculate, for each of the tasks, a loss value of the task based on a prediction result of each training sample output by a task sub-network of the task for the task and a label of each training sample in the training sample data set for the task; determining a total loss of the initial multi-tasking model based on the loss value of each task; adjusting parameters of the initial multitasking model based on the total loss.
Fig. 8 is a diagram illustrating an information recommendation device according to an example embodiment. Referring to fig. 8, the apparatus includes:
a second obtaining module 201, configured to obtain user information of a target user and media information to be recommended;
a processing module 202, configured to process the user information of the target user and each piece of media information by using a multitask model, so as to obtain a probability that each behavior occurs on each piece of media information by the target user; wherein the multitask model is obtained based on the training of the method;
a determining module 203, configured to determine a degree of interest of the target user in each piece of media information based on a probability of each action of the target user on each piece of media information;
the recommending module 204 is configured to rank the interest degree of each piece of media information, and determine target media information recommended to the target user from each piece of media information according to a ranking result.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 9 is a block diagram illustrating a server apparatus 900 according to an example embodiment. The server can be a server applied to multi-task model training or a server applied to information recommendation.
Referring to fig. 9, the apparatus 900 includes a processing component 922, which further includes one or more processors, and memory resources, represented by a memory 932, for storing instructions, such as application programs, executable by the processing component 922. The application programs stored in the memory 932 may include one or more modules, each corresponding to a set of instructions. Further, the processing component 922 is configured to execute the instructions to perform the model training method or the information recommendation method described above.
The apparatus 900 may also include a power component 926 configured to perform power management of the apparatus 900, a wired or wireless network interface 950 configured to connect the apparatus 900 to a network, and an input/output (I/O) interface 958. The apparatus 900 may operate based on an operating system stored in the memory 932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided that includes instructions, such as the memory 932 that includes instructions, that are executable by the processing component 922 of the apparatus 900 to perform the above-described method. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium having instructions therein, which when executed by a processing component of a server, enable the server to perform a method of training a multitasking model, the method comprising:
acquiring a training sample data set; the training sample data set comprises a plurality of training samples relevant to all tasks and labels of all tasks corresponding to all the training samples;
performing iterative training on the initial multi-task model by using the training sample data set until a convergence condition is met to obtain a trained multi-task model, wherein the process of each iterative training comprises the following steps:
performing feature extraction on each training sample in the training sample data set by using an initial feature extraction layer in an initial multi-task model to obtain first features output by the initial feature extraction layer;
performing feature processing on the first features by utilizing an expert layer in the initial multitask model to obtain second features output by the expert layer; the expert layer comprises a plurality of feature processing layers, one feature processing layer comprises a plurality of feature processing sub-networks, the feature processing sub-networks in the adjacent feature processing layers are connected in a one-to-one correspondence manner, and each feature processing sub-network is used for processing and outputting input features;
based on the second features output by the expert layer and the task sub-network corresponding to each task in the initial multi-task model, obtaining a prediction result of each training sample output by each task sub-network for the corresponding task;
and adjusting parameters of the initial multi-task model based on the prediction result of each training sample output by each task sub-network for the corresponding task and the label of each task corresponding to each training sample in the training sample data set.
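As a concrete, non-limiting illustration of the expert layer described above, the sketch below models each feature processing sub-network as one weighting step (a linear map) followed by a nonlinear mapping (ReLU), with the sub-networks of adjacent feature processing layers connected one-to-one, so that each chain of sub-networks processes the shared first feature independently. The specific weights, activation, and layer count are placeholders, not values from the disclosure:

```python
def relu(vec):
    # Nonlinear mapping applied element-wise.
    return [max(0.0, x) for x in vec]

def linear(weights, bias, vec):
    # Weighting step: weights is a list of rows, bias a list of offsets.
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def expert_layer(first_feature, expert_chains):
    """expert_chains: one list per sub-network chain, each a list of
    (weights, bias) pairs, one pair per feature processing layer.

    Because adjacent feature processing layers are connected one-to-one,
    the i-th sub-network of layer k feeds only the i-th sub-network of
    layer k+1; each chain therefore runs independently over the shared
    first feature.
    """
    second_features = []
    for chain in expert_chains:
        h = first_feature
        for weights, bias in chain:       # weighting + nonlinear mapping
            h = relu(linear(weights, bias, h))
        second_features.append(h)         # sub-feature of the top layer
    return second_features
```

The list returned here corresponds to the second feature: one sub-feature per sub-network of the highest feature processing layer, ready to be screened by each task sub-network.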
The instructions in the storage medium, when executed by a processing component of a server, enable the server to further perform an information recommendation method, the method comprising:
acquiring user information of a target user and media information to be recommended;
processing the user information of the target user and each piece of media information by using a multitask model to obtain the probability of the target user performing each behavior on each piece of media information; wherein the multitask model is obtained by training with the method described above;
determining the interest degree of the target user in each piece of media information based on the probability of each action of the target user on each piece of media information;
and ranking the interest degree of each piece of media information, and determining target media information recommended to the target user from each piece of media information according to a ranking result.
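The last two recommendation steps can be sketched as follows, where the per-behavior probabilities produced by the multitask model are combined into an interest degree and the media information is ranked. The weighted-sum combination and the top-k cutoff are illustrative assumptions; the disclosure only requires ranking by interest degree:

```python
def recommend(behavior_probs, behavior_weights, top_k=10):
    """behavior_probs: dict media_id -> {behavior: probability} from the
    multitask model; returns the top_k media ids by interest degree."""
    # Interest degree: weighted sum of per-behavior probabilities
    # (the weighting scheme is an assumption for illustration).
    interest = {
        media_id: sum(behavior_weights.get(b, 1.0) * p
                      for b, p in probs.items())
        for media_id, probs in behavior_probs.items()
    }
    # Rank by descending interest degree and keep the top_k items.
    ranked = sorted(interest, key=interest.get, reverse=True)
    return ranked[:top_k]
```

The target media information recommended to the user would then be drawn from the head of this ranking.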
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method for training a multitask model, the method comprising:
acquiring a training sample data set; the training sample data set comprises a plurality of training samples relevant to each task and labels of each task corresponding to each training sample;
performing iterative training on the initial multi-task model by using the training sample data set until a convergence condition is met to obtain a trained multi-task model, wherein the process of each iterative training comprises the following steps:
performing feature extraction on each training sample in the training sample data set by using an initial feature extraction layer in the initial multitask model to obtain a first feature output by the initial feature extraction layer;
performing feature processing on the first features by utilizing an expert layer in the initial multitask model to obtain second features output by the expert layer; the expert layer comprises a plurality of feature processing layers, one feature processing layer comprises a plurality of feature processing sub-networks, the feature processing sub-networks in the adjacent feature processing layers are connected in a one-to-one correspondence manner, and each feature processing sub-network is used for processing and outputting input features;
based on the second features output by the expert layer and the task sub-network corresponding to each task in the initial multi-task model, obtaining a prediction result of each training sample output by each task sub-network for the corresponding task;
and adjusting parameters of the initial multi-task model based on the prediction result of each training sample output by each task sub-network for the corresponding task and the label of each task corresponding to each training sample in the training sample data set.
2. The method of claim 1, wherein the second feature output by the expert layer comprises: the sub-features output by each feature processing sub-network included in the highest feature processing layer of the expert layer;
the obtaining a prediction result of each training sample output by each task sub-network for a corresponding task based on the second feature output by the expert layer and the task sub-network corresponding to each task in the initial multi-task model includes:
for each task sub-network in the initial multi-task model, screening the sub-features output by each feature processing sub-network included in the highest feature processing layer by using at least binary coding variables to obtain third features corresponding to the task sub-network;
and utilizing each task sub-network to respectively learn the third features corresponding to that task sub-network to obtain the prediction result of each training sample output by each task sub-network for the corresponding task.
3. The method according to claim 2, wherein the screening, by using at least binary coding variables, the sub-features output by each feature processing sub-network included in the highest feature processing layer to obtain a third feature corresponding to a task sub-network comprises:
screening and weighting the sub-features output by each feature processing sub-network included in the highest feature processing layer by using the binary coding variables and a weight matrix to obtain the third feature corresponding to the task sub-network.
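A non-limiting sketch of this screening and weighting, assuming the binary coding variables form a 0/1 mask over the top-layer sub-networks and the weight matrix reduces to one scalar weight per sub-feature; combining the selected, weighted sub-features by summation is likewise an assumption for illustration:

```python
def screen_features(sub_features, binary_mask, weight_matrix):
    """sub_features: list of sub-feature vectors from the highest feature
    processing layer. binary_mask: one 0/1 value per sub-network, selecting
    which sub-features a task sub-network sees. weight_matrix: one weight
    per sub-network. Returns the third feature as a weighted sum."""
    dim = len(sub_features[0])
    third = [0.0] * dim
    for feat, keep, w in zip(sub_features, binary_mask, weight_matrix):
        if keep:  # screening: the binary variable gates the sub-feature
            third = [t + w * x for t, x in zip(third, feat)]  # weighting
    return third
```

Each task sub-network would hold its own mask and weights, so different tasks can attend to different subsets of the shared sub-features.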
4. The method according to any one of claims 1 to 3, wherein the expert layer comprises a first feature processing layer and a second feature processing layer, and the feature processing of the first feature by the expert layer in the initial multitask model to obtain a second feature output by the expert layer comprises:
respectively performing weighting processing and nonlinear mapping on the first features by using each feature processing sub-network included in the first feature processing layer to obtain intermediate features output by each feature processing sub-network in the first feature processing layer;
and performing weighting processing and nonlinear mapping on the intermediate features output by the corresponding feature processing sub-networks in the first feature processing layer by using each feature processing sub-network included in the second feature processing layer to obtain the second features output by the expert layer.
5. The method of claim 1, wherein the adjusting parameters of the initial multi-task model based on the predicted result of each training sample output by each task sub-network for the corresponding task and the label of each task corresponding to each training sample in the training sample data set comprises:
for each task, calculating a loss value of the task based on a prediction result of each training sample output by a task sub-network of the task for the task and a label of each training sample in the training sample data set for the task;
determining a total loss of the initial multi-task model based on the loss value of each task;
adjusting parameters of the initial multi-task model based on the total loss.
6. An information recommendation method, characterized in that the method comprises:
acquiring user information of a target user and media information to be recommended;
processing the user information of the target user and each piece of media information by using a multitask model to obtain the probability of each behavior of the target user on each piece of media information; wherein the multitask model is obtained based on the method training of any one of claims 1-5;
determining the interest degree of the target user in each piece of media information based on the probability of each action of the target user on each piece of media information;
and ranking the interest degree of each piece of media information, and determining target media information recommended to the target user from each piece of media information according to a ranking result.
7. An apparatus for training a multitask model, the apparatus comprising:
the first acquisition module is configured to acquire a training sample data set; the training sample data set comprises a plurality of training samples relevant to each task and labels of each task corresponding to each training sample;
the training module is configured to perform iterative training on the initial multi-task model by using the training sample data set until a convergence condition is met to obtain a trained multi-task model, and the process of each iterative training comprises the following steps:
the feature extraction module is configured to perform feature extraction on each training sample in the training sample data set by using an initial feature extraction layer in the initial multitask model to obtain a first feature output by the initial feature extraction layer;
the feature processing module is configured to perform feature processing on the first feature by utilizing an expert layer in the initial multitask model to obtain a second feature output by the expert layer; the expert layer comprises a plurality of feature processing layers, one feature processing layer comprises a plurality of feature processing sub-networks, the feature processing sub-networks in the adjacent feature processing layers are connected in a one-to-one correspondence manner, and each feature processing sub-network is used for processing and outputting input features;
the prediction module is configured to obtain a prediction result of each training sample output by each task sub-network for a corresponding task based on a second feature output by the expert layer and the task sub-network corresponding to each task in the initial multi-task model;
and the parameter adjusting module is configured to adjust parameters of the initial multi-task model based on a prediction result of each training sample output by each task sub-network for a corresponding task and a label of each task corresponding to each training sample in the training sample data set.
8. An information recommendation apparatus, characterized in that the apparatus comprises:
the second acquisition module is configured to acquire the user information of the target user and the media information to be recommended;
the processing module is configured to process the user information of the target user and each piece of media information by using a multitask model to obtain the probability of each behavior of the target user on each piece of media information; wherein the multitask model is obtained by training based on the method of any one of claims 1-5;
the determining module is configured to determine the interest degree of the target user in each piece of media information based on the probability of each action of the target user on each piece of media information;
and the recommending module is configured to rank the interest degree of each piece of media information and determine the target media information recommended to the target user from each piece of media information according to a ranking result.
9. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform a method of training a multitask model according to any one of claims 1 to 5 or a method of information recommendation according to claim 6.
10. A non-transitory computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of training a multitask model according to any one of claims 1 to 5 or the method of information recommendation according to claim 6.
CN202211627517.4A 2022-12-16 2022-12-16 Model training method, information recommendation method, device, equipment and medium Pending CN115905872A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211627517.4A CN115905872A (en) 2022-12-16 2022-12-16 Model training method, information recommendation method, device, equipment and medium


Publications (1)

Publication Number Publication Date
CN115905872A true CN115905872A (en) 2023-04-04

Family

ID=86487888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211627517.4A Pending CN115905872A (en) 2022-12-16 2022-12-16 Model training method, information recommendation method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN115905872A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116805253A (en) * 2023-08-18 2023-09-26 腾讯科技(深圳)有限公司 Intervention gain prediction method, device, storage medium and computer equipment
CN116805253B (en) * 2023-08-18 2023-11-24 腾讯科技(深圳)有限公司 Intervention gain prediction method, device, storage medium and computer equipment

Similar Documents

Publication Publication Date Title
CN111291266B (en) Artificial intelligence based recommendation method and device, electronic equipment and storage medium
Chen et al. Deep reinforcement learning in recommender systems: A survey and new perspectives
CN110781321B (en) Multimedia content recommendation method and device
CN111090756B (en) Artificial intelligence-based multi-target recommendation model training method and device
JP2022531641A (en) Quantization model optimization method, device, information recommendation method, device, neural network model optimization method, device, electronic device and computer program
CN111046275B (en) User label determining method and device based on artificial intelligence and storage medium
KR102407595B1 (en) Method, device and system for providing winery experience service based on metaverse
CN111741330A (en) Video content evaluation method and device, storage medium and computer equipment
CN111708950A (en) Content recommendation method and device and electronic equipment
CN110889450B (en) Super-parameter tuning and model construction method and device
CN112995652B (en) Video quality evaluation method and device
CN112257841A (en) Data processing method, device and equipment in graph neural network and storage medium
CN111274438A (en) Language description guided video time sequence positioning method
CN115905872A (en) Model training method, information recommendation method, device, equipment and medium
CN112819024A (en) Model processing method, user data processing method and device and computer equipment
Lopes et al. Manas: multi-agent neural architecture search
WO2024051707A1 (en) Recommendation model training method and apparatus, and resource recommendation method and apparatus
CN113869377A (en) Training method and device and electronic equipment
CN115482021A (en) Multimedia information recommendation method and device, electronic equipment and storage medium
CN117056595A (en) Interactive project recommendation method and device and computer readable storage medium
CN112988851A (en) Counterfactual prediction model data processing method, device, equipment and storage medium
CN117473951A (en) Text processing method, device and storage medium
CN112269943A (en) Information recommendation system and method
CN116975347A (en) Image generation model training method and related device
CN115017362A (en) Data processing method, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination