CN115545114A - Training method of multi-task model, content recommendation method and device

Info

Publication number: CN115545114A
Application number: CN202211293317.XA
Authority: CN
Inventors: 林中平
Original assignee: Weimeng Chuangke Network Technology China Co Ltd
Current assignee: Weimeng Chuangke Network Technology China Co Ltd
Priority date: 2022-10-21
Filing date: 2022-10-21
Publication date: 2022-12-30

Abstract

The embodiment of the application provides a training method of a multitask model, a content recommendation method and a content recommendation device, wherein the training method of the multitask model comprises the following steps: acquiring a training sample, wherein the training sample at least comprises user behavior data, user characteristics and historical browsing data corresponding to the user behavior data; determining interest vector representation of a target user according to user behavior data, historical browsing data and user characteristics; and inputting the interest vector representation into the multi-task model to be trained for training until a target loss function of the multi-task model to be trained is converged, so as to obtain the trained multi-task model.

Description

Training method of multi-task model, content recommendation method and device

Technical Field

The invention relates to the technical field of internet, in particular to a training method of a multitask model, a content recommendation method and a content recommendation device.

Background

In an information-flow recommendation system, the user's interactions with a product, such as clicks and other interactions, together with the user's viewing duration in the information flow, reflect the quality of the user experience. When pushing products to users, the system predicts each user's potential interest in a product from multiple angles, so that the products can be ranked comprehensively and delivered accurately.

In the related art, three independent models are usually designed for the click rate, the interaction rate and the viewing duration of users with respect to products. After the three models have been fully trained, they are deployed on a server of the information-flow recommendation system, the prediction scores of a new product under the three models are output in real time, and the recommended content is presented to the user after comprehensive ranking. Because the models for the different tasks are trained independently, training efficiency is low and common resources cannot be shared among the models, which wastes resources. In addition, the models trained separately on click-rate, interaction-rate and viewing-duration data recommend content to users with low accuracy.

Disclosure of Invention

The embodiment of the application aims to provide a training method of a multitask model, a content recommendation method and a content recommendation device, so that resource waste is reduced, and the training efficiency of the model and the accuracy of content recommended to a user are improved.

In order to solve the above technical problem, the embodiment of the present application is implemented as follows:

in a first aspect, an embodiment of the present application provides a method for training a multitask model, where the method for training includes: acquiring a training sample, wherein the training sample at least comprises user behavior data, user characteristics and historical browsing data corresponding to the user behavior data; determining interest vector representation of a target user according to the user behavior data, the historical browsing data and the user characteristics; inputting the interest vector representation into a multi-task model to be trained for training until a target loss function of the multi-task model to be trained is converged to obtain a trained multi-task model; the multitask model comprises a sharing layer and a plurality of subtask models, wherein the sharing layer enables content sharing among the subtask models, the training samples serve as input of the subtask models, each subtask model outputs a predicted value aiming at a training target of each subtask model according to user behavior data, user characteristics and historical browsing data corresponding to the user behavior data in the training samples, and the training target comprises at least one of click rate, interaction rate and watching duration aiming at candidate content to be watched.

In a second aspect, an embodiment of the present application provides a content recommendation method, where the content recommendation method includes: acquiring browsing data and user characteristics of a target user; inputting the browsing data and the user characteristics into a multitask model for content prediction to obtain prediction information output by the multitask model, wherein the multitask model is obtained by training according to the training method of the multitask model mentioned in the first aspect; and recommending target content corresponding to the prediction information for the target user according to the indication of the prediction information.

In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a communication interface, a memory, and a communication bus; the processor, the communication interface and the memory complete mutual communication through a bus; the memory is used for storing a computer program; the processor is configured to execute the program stored in the memory to implement the steps of the first aspect or the second aspect.

In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the method steps according to the first or second aspect.

In a fifth aspect, embodiments of the present application provide a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method steps according to the first aspect or the second aspect.

According to the technical scheme provided by the embodiment of the application, the training sample is obtained and at least comprises user behavior data, user characteristics and historical browsing data corresponding to the user behavior data; determining interest vector representation of a target user according to user behavior data, historical browsing data and user characteristics; inputting the interest vector representation into the multi-task model to be trained for training until a target loss function of the multi-task model to be trained is converged to obtain a trained multi-task model; the multitask model comprises a sharing layer and a plurality of subtask models, the sharing layer enables content sharing among the subtask models, a training sample serves as input of each subtask model, each subtask model outputs a predicted value aiming at a training target of each subtask model according to user behavior data, user characteristics and historical browsing data corresponding to the user behavior data in the training sample, and the training target comprises at least one of click rate, interaction rate and watching duration aiming at candidate content to be watched. In addition, the interest vector representation of the user can be determined according to the user behavior data, the user characteristics and the historical browsing data, the multi-task model is trained by using the interest vector representation, the multi-task model to be trained learns the interest of the user according to the interest vector representation, the multi-task model obtained by training can recommend the content to the user according to the interest of the user, and the accuracy of recommending the content to the user by the model is improved.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments described in the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a schematic flowchart of a method for training a multitask model according to an embodiment of the present application;

FIG. 2 is a schematic flowchart of a content recommendation method according to an embodiment of the present application;

FIG. 3 is a schematic functional module diagram of a training apparatus for a multitask model according to an embodiment of the present application;

FIG. 4 is a schematic functional module diagram of a content recommendation device according to an embodiment of the present application;

FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make those skilled in the art better understand the technical solution of the present invention, the technical solution in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In the related art, three independent models are usually designed for the click rate, the interaction rate and the viewing duration of users with respect to products. After the three models have been fully trained, they are deployed on a server of the information-flow recommendation system, the prediction scores of a new product under the three models are output in real time, and the recommended content is presented to the user after comprehensive ranking. Because the models for the different tasks are trained independently, training efficiency is low and general resources cannot be shared among the models, which wastes resources. Moreover, models trained only on click-rate, interaction-rate and viewing-duration data cannot capture the user's hidden interests and hobbies, so the accuracy of the content recommended to the user is low.

In order to solve the above technical problems, embodiments of the present application provide a method for training a multitask model, a method for recommending content, and an apparatus for recommending content, and the method for training a multitask model, the method for recommending content, and the apparatus for recommending content provided by embodiments of the present application are described in detail below with reference to the accompanying drawings.

As shown in FIG. 1, the method may be executed by a server, which may be a standalone server or a server cluster composed of multiple servers. The training method of the multitask model specifically comprises the following steps S101 to S105:

In step S101, training samples are acquired.

The training samples at least comprise user behavior data, user characteristics and historical browsing data corresponding to the user behavior data.

Specifically, the training samples are used to train the multitask model on the server. Each training sample comprises at least user characteristics, user behavior data and historical browsing data. The user behavior data comprises the labels of the three targets (whether the user clicked, whether the user interacted, and the user's viewing) and the target behavior sequence. The user characteristics include, but are not limited to, age, gender, occupation and name; the historical browsing data includes, but is not limited to, the type, subject and author information of the content browsed by the user. The target behavior sequence is the content with which the user has recently interacted and can be taken from multiple angles, for example a user click sequence, a user interaction sequence and a sequence of videos watched by the user; the behavior sequence of each angle corresponds to its own historical browsing data. The features of the user behavior data, the user characteristics and the historical browsing data corresponding to the user behavior data are hash-mapped in a uniform way, which establishes the correspondence among the three.

Each feature corresponds to a unique slot. The slot number is represented by a 10-bit binary code, and the feature value is hash-mapped to a 54-bit binary code; the two codes are concatenated into a 64-bit code that represents the feature/feature-value combination. After this unique hash encoding, the 64-bit binary code is converted into a decimal index, and the feature together with its value (the corresponding data) is mapped to a low-dimensional dense vector by a lookup in a large-scale embedding matrix, which reduces the time complexity and space complexity of the subsequent multitask model. In this embodiment the vector dimension may be set to 16.
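The slot-plus-hash encoding described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the text does not name a hash function, so SHA-256 truncated to 54 bits is an assumption, and the slot numbers, the helper names `encode_feature`, `decode_slot` and `lookup_embedding`, and the lazily filled dictionary standing in for the large-scale embedding matrix are all hypothetical.

```python
import hashlib
import random

SLOT_BITS = 10    # slot number width stated in the text
VALUE_BITS = 54   # hashed feature-value width stated in the text
EMBED_DIM = 16    # vector dimension used in the embodiment


def encode_feature(slot_id: int, value: str) -> int:
    """Pack a 10-bit slot number and a 54-bit value hash into one 64-bit index."""
    if not 0 <= slot_id < (1 << SLOT_BITS):
        raise ValueError("slot_id must fit in 10 bits")
    # Assumption: SHA-256 truncated to 54 bits; the text only says "Hash mapping".
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    value_hash = int.from_bytes(digest[:8], "big") & ((1 << VALUE_BITS) - 1)
    return (slot_id << VALUE_BITS) | value_hash


def decode_slot(index: int) -> int:
    """Recover the slot number from the upper 10 bits of the index."""
    return index >> VALUE_BITS


# Hypothetical stand-in for the embedding matrix: rows are created on first use.
_embedding_table = {}


def lookup_embedding(index: int) -> list:
    """Return the 16-dimensional dense vector stored for a 64-bit index."""
    if index not in _embedding_table:
        rng = random.Random(index)  # deterministic initial values per index
        _embedding_table[index] = [rng.uniform(-0.05, 0.05) for _ in range(EMBED_DIM)]
    return _embedding_table[index]


AGE_SLOT, GENDER_SLOT = 1, 2  # hypothetical slot assignment
age_index = encode_feature(AGE_SLOT, "25")
gender_index = encode_feature(GENDER_SLOT, "25")
age_vector = lookup_embedding(age_index)
```

Because the slot number occupies the upper bits, the same raw value under two different features yields two different indices, which is what lets a single table serve every feature.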

In step S103, an interest vector representation of the target user is determined according to the user behavior data, the historical browsing data and the user characteristics.

Specifically, the characterization of the interest vector of the user refers to characterizing an interest orientation of the user, which represents content of interest of the user, and may specifically determine the characterization of the interest vector of the target user by using a target behavior sequence of the user, where the target behavior includes at least one of a click behavior, a user interaction behavior, and a user video watching behavior, different target behavior sequences correspond to different historical browsing data, and determining the characterization of the interest vector of the target user according to the user behavior data, the historical browsing data, and the user characteristic includes:

extracting characteristics of historical browsing data of a user target behavior sequence based on a self-attention mechanism model to obtain a plurality of interest representations of the target behavior sequence of the target user corresponding to the user characteristics; and performing weighted fusion on the interest representations to obtain the interest vector representations.

Specifically, in the embodiment of the present application, a one-layer encoder and decoder of a self-attention mechanism model (Transformer) may be adopted. When determining the interest vector representation, the user characteristics, the user target behavior sequence and the historical browsing data of the user target behavior sequence may be mapped into a low-dimensional matrix; the embedded representation of the target behavior sequence of the target user is determined from the low-dimensional matrix; a new representation of the target behavior sequence of the target user is determined from the low-dimensional matrix and the embedded representation based on the self-attention mechanism model; the association degree between the new representation and the candidate content is calculated; and the embedded representation of the target behavior sequence of the target user is linearly weighted by the association degree to obtain an interest representation.

Specifically, a low-dimensional embedding is used to map the user characteristics, the user target behavior sequence and the historical browsing data into a low-dimensional matrix $B$ with $L$ rows (the sequence length) and $d$ columns (the embedding dimension), that is, $B \in \mathbb{R}^{L \times d}$. In the embodiment of the application, the encoder-decoder structure of the self-attention mechanism model is used to extract the features contained in the user target behavior sequence. To improve the learning speed of the multitask model, only one encoder layer and one decoder layer are used: the encoder outputs the new representation of the target behavior sequence of the target user, and the decoder outputs the interest representation of the target user.

More specifically, for the encoder part, the new representation is calculated as follows.

First, the low-dimensional matrix is multiplied by different transformation matrices to obtain the transformed low-dimensional matrices:

$$Q = B W_Q$$

$$K = B W_K$$

$$V = B W_V$$

where $Q$, $K$ and $V$ denote the transformed low-dimensional matrices, $W_Q$, $W_K$ and $W_V$ denote different transformation matrices, and $B$ is the low-dimensional matrix.

$Q$, $K$ and $V$ can be written as $Q, K, V \in \mathbb{R}^{L \times d_k}$ and represent the embedded representations after linear transformation, where $L$ is the length of the target behavior sequence and $d_k$ is the dimension of the embedded representation.

After the embedded representations are obtained, a similarity matrix $a$ is calculated from them with the attention mechanism (Attention), specifically by the following formula:

$$a = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right)$$

where $K^{T}$ is the transpose of the matrix $K$.

After the similarity matrix is obtained, the new representation is calculated by the following formula, where the encoder output $\mathrm{Output}_{encoder}$ is the new representation of each content item in the user target behavior sequence after attention weighting against the other content items:

$$\mathrm{Output}_{encoder} = a V$$
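The encoder computation can be illustrated with a small pure-Python sketch. It assumes the usual scaled dot-product form for the similarity matrix, in which the scores are divided by the square root of $d_k$ before a row-wise softmax; the helper names and the toy matrices are hypothetical and chosen only to keep the example self-contained.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Apply a numerically stable softmax to every row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([v / total for v in exps])
    return out


def encoder_self_attention(B, W_Q, W_K, W_V):
    """Return the similarity matrix a and Output_encoder = a V."""
    Q, K, V = matmul(B, W_Q), matmul(B, W_K), matmul(B, W_V)
    d_k = len(K[0])
    scores = matmul(Q, transpose(K))
    # Assumption: scores are scaled by sqrt(d_k) before the softmax.
    a = softmax_rows([[s / math.sqrt(d_k) for s in row] for row in scores])
    return a, matmul(a, V)


# Toy input: L = 3 sequence items, d = d_k = 2, identity transformation matrices.
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
a, output_encoder = encoder_self_attention(B, identity, identity, identity)
```

Each row of `a` sums to 1, so every row of `output_encoder` is a convex combination of the rows of `V`.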

For the decoder part, the interest representation is calculated as follows.

At the decoder, the association degree $e$ between the new representation and the candidate content is calculated, specifically by the following formula:

$$e = \mathrm{softmax}(\mathrm{Output}_{encoder} \, c^{T})$$

where $c^{T}$ is the transpose of the candidate content matrix $c$.

The calculated association degree $e$ is used as the weight for a linear weighting of the embedded representations of the target behavior sequence of the target user, which gives the interest representation of the target user, specifically by the following formula:

$$\mathrm{Vec} = \sum_{i=1}^{L} e_i b_i$$

where $e_i$ is the $i$-th association degree and $b_i$ is the $i$-th embedded representation.
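The decoder step can be sketched in the same style. The sketch assumes that the candidate content is a single vector with the same dimension as each row of the encoder output, and that the softmax runs over the $L$ sequence positions; the function name and the toy values are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [v / total for v in exps]


def interest_representation(output_encoder, embeddings, candidate):
    """Weight the embedded representations b_i by their relevance to the candidate."""
    # One score per sequence item: dot product of its new representation with c.
    scores = [sum(o * c for o, c in zip(row, candidate)) for row in output_encoder]
    e = softmax(scores)
    dim = len(embeddings[0])
    vec = [sum(e_i * b_i[j] for e_i, b_i in zip(e, embeddings)) for j in range(dim)]
    return e, vec


output_encoder = [[1.0, 0.0], [0.0, 1.0]]
embeddings = [[2.0, 0.0], [0.0, 4.0]]  # the b_i of the text
candidate = [0.0, 0.0]                 # a zero candidate gives uniform weights
e, vec = interest_representation(output_encoder, embeddings, candidate)
```

With a zero candidate vector both scores are equal, so the weights are uniform and the result is the plain average of the two embeddings.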

In this way, the interest representation of the user target behavior sequence is obtained for each of the user's target behaviors.

According to the different target behaviors, the target behavior sequences comprise a user click behavior sequence, a user interaction behavior sequence and a user video watching behavior sequence, and the interest representations obtained in the above manner are denoted $\mathrm{Vec}_{click}$, $\mathrm{Vec}_{interact}$ and $\mathrm{Vec}_{view}$ respectively.

To fuse the interest representations of the user's different behavior sequences effectively, an attention mechanism (gate mechanism) may first be used to calculate the weight $G_{out}$ of each interest representation, specifically by the following formula:

$$G_{out} = \mathrm{softmax}(\mathrm{concat}(\mathrm{Vec}_{click}, \mathrm{Vec}_{interact}, \mathrm{Vec}_{view}) \, W_{project})$$

where $\mathrm{concat}(\cdot)$ is the vector concatenation operator and $W_{project}$ is a learnable parameter matrix.

The final interest vector representation $\mathrm{Vec}_{final}$ of the user is then obtained by weighted fusion, specifically by the following formula:

$$\mathrm{Vec}_{final} = G_{out,1}\,\mathrm{Vec}_{click} + G_{out,2}\,\mathrm{Vec}_{interact} + G_{out,3}\,\mathrm{Vec}_{view}$$

where $G_{out,1}$, $G_{out,2}$ and $G_{out,3}$ are the components of $G_{out}$.
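The gate-based fusion can be sketched as below. The sketch assumes that $W_{project}$ has shape $3d \times 3$, so that the gate produces one weight per behavior sequence, and that the final vector is the gate-weighted sum of the three interest representations; the function name and the toy inputs are hypothetical.

```python
import math


def fuse_interests(vec_click, vec_interact, vec_view, W_project):
    """Return the gate weights G_out and the fused interest vector."""
    concat = vec_click + vec_interact + vec_view  # list concatenation, length 3d
    # Project the concatenated vector to three logits, one per behavior type.
    logits = [sum(x * W_project[i][j] for i, x in enumerate(concat)) for j in range(3)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    g_out = [v / total for v in exps]
    vecs = (vec_click, vec_interact, vec_view)
    # Assumption: Vec_final is the gate-weighted sum of the three representations.
    vec_final = [sum(g * v[k] for g, v in zip(g_out, vecs)) for k in range(len(vec_click))]
    return g_out, vec_final


d = 2
W_zero = [[0.0] * 3 for _ in range(3 * d)]  # zero weights give a uniform gate
g_out, vec_final = fuse_interests([3.0, 0.0], [0.0, 3.0], [3.0, 3.0], W_zero)
```

In training, `W_project` would be learned together with the rest of the model, so the gate can favor whichever behavior sequence is most informative for a given user.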

In step S105, the interest vector representation is input into the multitask model to be trained for training until the target loss function of the multitask model to be trained converges, so as to obtain the trained multitask model.

The multitask model comprises a sharing layer and a plurality of subtask models, the sharing layer enables content among the subtask models to be shared, training samples serve as input of the subtask models, the subtask models output predicted values aiming at training targets of the subtask models according to user behavior data, user characteristics and historical browsing data corresponding to the user behavior data in the training samples, the training targets comprise at least one of click rate, interaction rate and watching duration aiming at candidate content to be checked, the target loss function comprises loss functions of the subtask models, and the loss functions of the subtask models indicate the difference degree between the predicted values and the real values of the subtask models.

Specifically, a multitask model models a plurality of subtasks (targets). Its general structure comprises a shared layer and independent towers: the shared layer allows knowledge to be shared among the different subtasks, and the number of independent towers equals the number of subtasks. The interest vector representation enters the multitask model, first passes through the shared layer and then enters the tower layer of each subtask, and the tower layers finally output the predicted values of the different targets. In the embodiment of the application, the underlying network structure of the multitask model uses a Deep Neural Network (DNN) plus Sub-Network Routing (SNR) strategy. The SNR strategy allows the parameters of the DNN-based shared layer to transfer knowledge dynamically among the different subtasks (targets), so that parameters are shared flexibly.
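The shared-layer-plus-towers data flow can be illustrated with a deliberately small forward pass. This sketch leaves out the SNR routing entirely and assumes one dense shared layer with ReLU, one single-output tower per subtask, a sigmoid on the click-rate and interaction-rate outputs, and a linear output for viewing duration; all names, shapes and weights are hypothetical.

```python
import math


def dense(x, weights, bias):
    """Affine layer: weights has shape (inputs, outputs)."""
    return [sum(xi * w for xi, w in zip(x, col)) + b
            for col, b in zip(zip(*weights), bias)]


def relu(v):
    return [max(0.0, x) for x in v]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def multitask_forward(vec_final, shared, towers):
    """Pass the interest vector through the shared layer, then through each tower."""
    hidden = relu(dense(vec_final, shared[0], shared[1]))
    predictions = {}
    for name, (weights, bias) in towers.items():
        raw = dense(hidden, weights, bias)[0]
        # Assumption: rate targets are probabilities, duration is a real value.
        predictions[name] = raw if name == "duration" else sigmoid(raw)
    return predictions


shared = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
towers = {
    "ctr": ([[0.0], [0.0]], [0.0]),
    "interact": ([[1.0], [1.0]], [0.0]),
    "duration": ([[2.0], [3.0]], [1.0]),
}
predictions = multitask_forward([1.0, 2.0], shared, towers)
```

All three towers read the same `hidden` vector, which is the sense in which the shared layer lets the subtasks share knowledge.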

Suppose $W \in \mathbb{R}^{M \times N}$ is the parameter matrix of the shared layer. $W$ is multiplied element-wise by a binary matrix $Z \in \{0,1\}^{M \times N}$ of the same size, which sparsifies $W$. So that $Z$ is differentiable in the loss function, $Z$ is generated with the reparameterization trick and a hard-sigmoid strategy. The specific calculation process is as follows:

$$S = \mathrm{sigmoid}\left(\log(U) - \log(1 - U) + \log(\alpha)/\beta\right)$$

where $U$ is a parameter matrix of the same size as $W$, and $\alpha$, $\beta$, $\xi$ and $\gamma$ are constants used to generate the $Z$ matrix. For example, in the embodiment of the application, the multitask model adopts a "DNN + SNR" structure, and the SNR strategy is applied between the shared layer and the independent tower layers of the subtasks, so that the connections of some neurons can be masked, which gives a property similar to sparse sharing. The number of tower layers equals the number of subtasks, and each tower layer independently learns the target of its subtask and outputs the predicted value of that target.
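A sketch of how such a mask might be generated follows. Only the formula for $S$ appears in the text, so the remaining two steps are assumptions taken from the usual hard-concrete construction: $S$ is stretched to the interval $(\gamma, \xi)$ and then clipped to $[0, 1]$. Drawing $U$ as uniform noise, the constant values and the function names are likewise assumptions. The grouping $\log(\alpha)/\beta$ follows the text as printed; the hard-concrete literature divides the whole sum by $\beta$.

```python
import math
import random


def sample_mask(rows, cols, alpha, beta, xi, gamma, rng):
    """Generate a relaxed mask Z with entries in [0, 1]."""
    mask = []
    for _ in range(rows):
        row = []
        for _ in range(cols):
            # Assumption: U is uniform noise, kept away from 0 and 1 for the logs.
            u = min(max(rng.random(), 1e-6), 1.0 - 1e-6)
            logit = math.log(u) - math.log(1.0 - u) + math.log(alpha) / beta
            s = 1.0 / (1.0 + math.exp(-logit))
            s_bar = s * (xi - gamma) + gamma          # assumed stretch step
            row.append(min(1.0, max(0.0, s_bar)))     # assumed hard-sigmoid clip
        mask.append(row)
    return mask


def apply_mask(W, Z):
    """Element-wise product of the shared-layer weights and the mask."""
    return [[w * z for w, z in zip(w_row, z_row)] for w_row, z_row in zip(W, Z)]


W = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]
Z = sample_mask(2, 3, alpha=1.0, beta=0.9, xi=1.1, gamma=-0.1, rng=random.Random(0))
W_sparse = apply_mask(W, Z)
```

Because the stretch pushes part of the distribution below 0 and above 1, the clip produces exact zeros and ones for some entries while keeping the mask differentiable elsewhere.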

In a possible implementation, the plurality of subtask models includes a click-rate subtask model, an interaction-rate subtask model and a viewing-duration subtask model. The loss functions of the click-rate subtask model and the interaction-rate subtask model are cross-entropy functions, and the loss function of the viewing-duration subtask model is a root-mean-square-error function.
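The two kinds of loss named above can be written out as a short sketch. It assumes binary labels for click and interaction, with the cross entropy averaged over the batch; the function names and the clipping constant are hypothetical.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the mean squared difference."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


loss_ctr = binary_cross_entropy([1, 0], [0.5, 0.5])
loss_time = root_mean_square_error([0.0, 0.0], [3.0, 4.0])
```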

Specifically, the target loss function is a weighted sum of the loss function of the click-rate subtask model, the loss function of the interaction-rate subtask model, and the loss function of the viewing-time subtask model.

Further, each subtask has its own loss function. Fusing the loss functions with equal weights may not bring every subtask model to its optimal state, and because the gradient magnitudes of the subtask models differ, weights that stay fixed throughout training cannot achieve the best effect either. Therefore, in the embodiment of the present application, an uncertainty-weighted learning strategy is adopted in which the weights of the loss functions of the different subtask models are parameterized; that is, the target loss function is determined based on the uncertainty-weighted learning strategy. The target loss function then comprises the weighted sum of the loss function of the click-rate subtask model, the loss function of the interaction-rate subtask model and the loss function of the viewing-duration subtask model, plus a logarithmic term in the parameters of the uncertainty-weighted learning strategy.

More specifically, the objective loss function of the multitask model can be expressed by the following formula:

where $\sigma_1$ to $\sigma_3$ are the trainable parameters of the different subtasks, $loss_{ctr}$ is the loss function of the click-rate subtask model, $loss_{interact}$ is the loss function of the interaction-rate subtask model, and $loss_{time}$ is the loss function of the viewing-duration subtask model.
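One common way to realize such an uncertainty-weighted loss is sketched below. The exact coefficients are not recoverable from the text, so the form used here, each loss divided by $2\sigma_i^2$ plus a $\log \sigma_i$ term, is an assumption, as is the choice to store $\log \sigma_i$ as the trainable quantity so that $\sigma_i$ stays positive. The function name is hypothetical.

```python
import math


def uncertainty_weighted_loss(losses, log_sigmas):
    """Combine per-task losses with trainable uncertainty weights.

    Assumed form: sum_i( loss_i / (2 * sigma_i**2) + log(sigma_i) ).
    """
    total = 0.0
    for loss, log_sigma in zip(losses, log_sigmas):
        sigma_sq = math.exp(2.0 * log_sigma)
        total += loss / (2.0 * sigma_sq) + log_sigma
    return total


# loss_ctr, loss_interact, loss_time with sigma_1 = sigma_2 = sigma_3 = 1
target_loss = uncertainty_weighted_loss([0.6, 0.4, 2.0], [0.0, 0.0, 0.0])

# A larger sigma_3 down-weights the duration loss but pays a log penalty.
relaxed_loss = uncertainty_weighted_loss([0.6, 0.4, 2.0], [0.0, 0.0, 1.0])
```

The logarithmic term keeps the optimizer from driving every $\sigma_i$ to infinity, which would otherwise shrink all the weighted losses to zero.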

According to the technical scheme disclosed by the embodiment of the application, training samples are obtained, and the training samples at least comprise user behavior data, user characteristics and historical browsing data corresponding to the user behavior data; determining interest vector representation of a target user according to user behavior data, historical browsing data and user characteristics; inputting the interest vector representation into the multi-task model to be trained for training until a target loss function of the multi-task model to be trained is converged to obtain a trained multi-task model; the multitask model comprises a sharing layer and a plurality of subtask models, the sharing layer enables content sharing among the subtask models, a training sample serves as input of each subtask model, each subtask model outputs a predicted value aiming at a training target of each subtask model according to user behavior data, user characteristics and historical browsing data corresponding to the user behavior data in the training sample, and the training target comprises at least one of click rate, interaction rate and watching duration aiming at candidate content to be watched. In addition, the interest vector representation of the user can be determined according to the user behavior data, the user characteristics and the historical browsing data, the multi-task model is trained by using the interest vector representation, so that the multi-task model to be trained learns the interest orientation of the user according to the interest vector representation, the multi-task model obtained by training can recommend content to the user by combining the interest orientation of the user, and the accuracy of the model in recommending the content to the user is improved.

As shown in FIG. 2, the method may be executed by a server, which may be a standalone server or a server cluster composed of multiple servers. The content recommendation method may specifically comprise the following steps S201 to S205:

In step S201, browsing data and user characteristics of a target user are acquired.

Specifically, the browsing data of the target user includes, but is not limited to, the contents clicked, interacted, or viewed by the user, and the click sequence, the interaction sequence, and the viewing sequence corresponding to the contents, and the user characteristics include, but are not limited to, the information of age, gender, occupation, name, and the like.

In step S203, the browsing data and the user characteristics are input into the multitask model for content prediction to obtain the prediction information output by the multitask model.

Specifically, the multitask model is trained according to the above embodiment, and the prediction information includes predicted values of the click rate, interaction rate, and viewing time length of the candidate content.

In step S205, target content corresponding to the prediction information is recommended to the target user according to the indication of the prediction information.

After the browsing data and the user characteristics are input into the multitask model for content prediction, each subtask model outputs its predicted value, namely the click rate, the interaction rate and the viewing duration of each candidate content. There are multiple candidate contents; the candidate content with the highest ranking score across click rate, interaction rate and viewing duration is determined to be the target content and is recommended to the target user. When the same candidate content ranks differently under the predicted click rate, interaction rate and viewing duration, the candidate content with the highest ranking score is determined from all three predicted values. Specifically, the predicted click rate, interaction rate and viewing duration may be combined by weighted summation into a final value; the final values are sorted, and the candidate content with the highest value is selected as the target content and recommended to the target user.
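The weighted-sum ranking can be sketched as follows. The weights, the candidate identifiers and the predicted values are hypothetical, and the sketch assumes the viewing duration has already been normalized to a scale comparable with the two rates, a step the text does not describe.

```python
def rank_candidates(predictions, weights):
    """Sort candidate ids by the weighted sum of their predicted values, best first."""
    scored = []
    for candidate_id, predicted in predictions.items():
        score = sum(weights[name] * predicted[name] for name in weights)
        scored.append((score, candidate_id))
    scored.sort(reverse=True)
    return [candidate_id for _, candidate_id in scored]


# Hypothetical predictions; "duration" is assumed to be normalized to [0, 1].
predictions = {
    "a": {"ctr": 0.2, "interact": 0.05, "duration": 0.9},
    "b": {"ctr": 0.6, "interact": 0.10, "duration": 0.3},
    "c": {"ctr": 0.1, "interact": 0.40, "duration": 0.2},
}
weights = {"ctr": 0.5, "interact": 0.3, "duration": 0.2}  # hypothetical weights
ranking = rank_candidates(predictions, weights)
target_content = ranking[0]
```

Here candidate "a" has the longest predicted viewing duration and "c" the highest interaction rate, yet "b" is selected because its weighted sum is largest, which is how the fused score resolves disagreement among the three rankings.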

By the technical scheme disclosed by the embodiment of the application, the general content among all the subtask models of the multitask model can be shared, resource waste is avoided, the multitask model can recommend the content to the user according to the interest of the user, and the accuracy of recommending the content to the user by the model is improved.

Corresponding to the training method of the multitask model provided in the above embodiment, based on the same technical concept, an embodiment of the present application further provides a training device of the multitask model, fig. 3 is a schematic diagram of modules of the training device of the multitask model provided in the embodiment of the present application, the training device of the multitask model is used for executing the training method of the multitask model described in the above embodiment, and as shown in fig. 3, the training device 300 of the multitask model includes: an obtaining module 301, configured to obtain a training sample, where the training sample at least includes user behavior data, user characteristics, and historical browsing data corresponding to the user behavior data; a determining module 302, configured to determine an interest vector representation of a target user according to user behavior data, historical browsing data, and user characteristics; the training module 303 is configured to input the interest vector representation into the multi-task model to be trained for training until a target loss function of the multi-task model to be trained converges, so as to obtain a trained multi-task model; the multitask model comprises a sharing layer and a plurality of subtask models, the sharing layer enables content sharing among the subtask models, a training sample serves as input of each subtask model, each subtask model outputs a predicted value aiming at a training target of each subtask model according to user behavior data, user characteristics and historical browsing data corresponding to the user behavior data in the training sample, and the training target comprises at least one of click rate, interaction rate and watching duration aiming at candidate content to be watched.

According to the technical scheme disclosed by the embodiment of the application, general content among the subtask models of the multitask model can be shared, so separate single models for the different tasks do not need to be trained independently; this improves training efficiency and avoids resource waste. In addition, the interest vector representation of the user can be determined according to the user behavior data, the user characteristics and the historical browsing data, and the multitask model is trained using this interest vector representation, so that the multitask model to be trained can learn the interest of the user from it. The trained multitask model can therefore recommend content to the user in combination with the interest of the user, which improves the accuracy of the content the model recommends to the user.

In a possible implementation manner, the user behavior data includes a user target behavior sequence, the target behavior includes at least one of a click behavior, a user interaction behavior, and a video watching behavior, and different target behavior sequences correspond to different historical browsing data. The determining module 302 is further configured to: perform feature extraction on the historical browsing data corresponding to the user target behavior sequence based on the self-attention mechanism model to obtain a plurality of interest representations of the target behavior sequence of the target user corresponding to the user features; and perform weighted fusion on the interest representations to obtain the interest vector representation.

In a possible implementation manner, the determining module 302 is further configured to: map the user characteristics, the user target behavior sequence, and the historical browsing data corresponding to the user target behavior sequence into a low-dimensional matrix; determine the embedded representation of the target behavior sequence of the target user according to the low-dimensional matrix; determine a new representation of the target behavior sequence of the target user by using the low-dimensional matrix and the embedded representation based on the self-attention mechanism model; calculate the degree of association between the new representation and the candidate content to be viewed; and linearly weight the embedded representation of the target behavior sequence of the target user according to the degree of association to obtain an interest representation.
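The steps performed by the determining module 302 can be sketched roughly as below. This is a simplified illustration under several assumptions that the embodiment does not state: the embedded representations are given directly as lists of floats, the self-attention step uses scaled dot-product attention with identity query, key and value projections, the degree of association is a softmax-normalized dot product, and the fusion weights are supplied by the caller. All function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(dim)]


def self_attention(embeddings):
    """New representation of each behavior from all behaviors in the sequence."""
    scale = math.sqrt(len(embeddings[0]))
    new_representations = []
    for query in embeddings:
        attention = softmax([dot(query, key) / scale for key in embeddings])
        new_representations.append(weighted_sum(attention, embeddings))
    return new_representations


def interest_representation(embeddings, candidate):
    """Linearly weight the embedded behaviors by their association with the candidate."""
    new_representations = self_attention(embeddings)
    association = softmax([dot(r, candidate) for r in new_representations])
    return weighted_sum(association, embeddings)


def fuse_interests(interest_representations, fusion_weights):
    """Weighted fusion of per-sequence interests into one interest vector."""
    return weighted_sum(fusion_weights, interest_representations)
```

For example, with a click sequence embedded as `[[1.0, 0.0], [0.0, 1.0]]` and a candidate embedded as `[1.0, 0.0]`, the first behavior receives the larger weight, so the resulting interest representation leans toward the first dimension.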

In one possible implementation manner, the plurality of subtask models include a click rate subtask model, an interaction rate subtask model, and a viewing duration subtask model, where the loss function of the click rate subtask model and the loss function of the interaction rate subtask model use a cross entropy function, and the loss function of the viewing duration subtask model uses a root mean square error function.
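For reference, the two kinds of loss named above can be written in plain Python as follows. This is a minimal sketch that assumes binary labels (clicked or not, interacted or not) for the cross entropy and a batch average in both functions; the function names and the clipping constant `eps` are illustrative choices, not taken from the embodiment.

```python
import math


def binary_cross_entropy(labels, probabilities, eps=1e-7):
    """Mean cross entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def root_mean_square_error(targets, predictions):
    """Square root of the mean squared difference, e.g. for viewing duration."""
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))
```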

In one possible implementation, the target loss function is a weighted sum of the loss function of the click rate subtask model, the loss function of the interaction rate subtask model, and the loss function of the viewing duration subtask model.

In one possible implementation, the weights in the target loss function for the click rate subtask model, the interaction rate subtask model, and the viewing duration subtask model are determined based on an uncertainty weighted learning strategy, and the target loss function further includes: a sum of the weighted sum and a logarithmic term of the parameters in the uncertainty weighted learning strategy.
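One commonly used form of uncertainty weighting gives each task a learnable parameter sigma, weights the task loss by 1 / (2 * sigma^2), and adds log(sigma) as the logarithmic term. The sketch below assumes that form; the exact expression used in this application may differ, and the function and key names are hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, sigmas):
    """Weighted sum of task losses plus a logarithmic term per task.

    task_losses: mapping from task name to its current loss value.
    sigmas: mapping from task name to a positive uncertainty parameter.
    """
    total = 0.0
    for task, loss in task_losses.items():
        sigma = sigmas[task]
        weight = 1.0 / (2.0 * sigma ** 2)  # larger uncertainty, smaller weight
        total += weight * loss + math.log(sigma)  # log term discourages large sigma
    return total
```

In training, the sigma parameters would be optimized together with the model weights, so a task with a noisy loss is automatically down-weighted while the logarithmic term keeps its sigma from growing without bound.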

The training device for the multitask model provided by the embodiment of the application can implement each process in the embodiment corresponding to the training method for the multitask model; to avoid repetition, details are not described here again.

It should be noted that the training device for the multitask model provided in the embodiment of the present application and the training method for the multitask model provided in the embodiment of the present application are based on the same inventive concept and have the same technical effect, so for the specific implementation of this embodiment, reference may be made to the implementation of the training method for the multitask model, and details are not described here again.

Corresponding to the content recommendation method provided in the foregoing embodiment, and based on the same technical concept, an embodiment of the present application further provides a content recommendation device. Fig. 4 is a schematic diagram of the module composition of the content recommendation device provided in the embodiment of the present application; the content recommendation device is configured to execute the content recommendation method described in the foregoing embodiment. As shown in fig. 4, the content recommendation device 400 includes: an obtaining module 401, configured to obtain browsing data and user characteristics of a target user; a prediction module 402, configured to input the browsing data and the user characteristics into a multitask model for content prediction, so as to obtain prediction information output by the multitask model, where the multitask model is a multitask model obtained through training according to the foregoing embodiment; and a recommending module 403, configured to recommend, according to the indication of the prediction information, the target content corresponding to the prediction information for the target user.

In a possible implementation manner, the prediction information includes predicted values of the click rate, the interaction rate, and the viewing duration of the candidate content, and the recommending module 403 is further configured to determine the candidate content with the highest ranking score according to the predicted values of the click rate, the interaction rate, and the viewing duration of the candidate content, and recommend the candidate content with the highest ranking score to the target user.
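The passage above does not state how the three predicted values are combined into a ranking score. Purely as an illustration, the sketch below assumes a linear combination with hypothetical weights and assumes that the predicted viewing duration has already been normalized to the same scale as the two rates.

```python
def ranking_score(prediction, weights):
    """Linear combination of the predicted values (assumed scoring rule)."""
    return sum(weights[target] * prediction[target] for target in weights)


def recommend(candidate_predictions, weights):
    """Return the identifier of the candidate with the highest ranking score."""
    return max(
        candidate_predictions,
        key=lambda content_id: ranking_score(candidate_predictions[content_id], weights),
    )
```

For instance, with weights of 0.5, 0.3 and 0.2 for click rate, interaction rate and viewing duration, a candidate predicted at (0.5, 0.5, 0.9) scores 0.58 and is preferred over one predicted at (0.9, 0.1, 0.2), which scores 0.52.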

The content recommendation device provided in the embodiment of the present application can implement each process in the embodiment corresponding to the content recommendation method; to avoid repetition, details are not described here again.

It should be noted that the content recommendation device provided in the embodiment of the present application and the content recommendation method provided in the embodiment of the present application are based on the same inventive concept and have the same technical effect, so for the specific implementation of this embodiment, reference may be made to the implementation of the content recommendation method, and details are not described here again.

Corresponding to the content recommendation method or the training method of the multitask model provided in the foregoing embodiments, and based on the same technical concept, an embodiment of the present application further provides an electronic device configured to execute the content recommendation method or the training method of the multitask model. Fig. 5 is a schematic structural diagram of an electronic device implementing various embodiments of the present invention. As shown in fig. 5, electronic devices may vary widely in configuration or performance and may include one or more processors 501 and a memory 502, where the memory 502 may store one or more applications or data. The memory 502 may be transient or persistent storage. The application program stored in the memory 502 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the electronic device. Still further, the processor 501 may be arranged to communicate with the memory 502 and to execute, on the electronic device, the series of computer-executable instructions in the memory 502. The electronic device may also include one or more power supplies 503, one or more wired or wireless network interfaces 504, one or more input-output interfaces 505, and one or more keyboards 506.

In this embodiment, the electronic device includes a processor, a communication interface, a memory, and a communication bus. The processor, the communication interface and the memory communicate with one another through the communication bus; the memory is used for storing a computer program; and the processor is used for executing the program stored in the memory to implement the steps described in the method embodiments.

It should be noted that the electronic device provided in the embodiment of the present application and the content recommendation method or the training method of the multitask model provided in the embodiment of the present application are based on the same inventive concept and have the same technical effect, so for the specific implementation of this embodiment, reference may be made to the implementation of the content recommendation method or the training method of the multitask model, and details are not described here again.

The embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the steps described in the foregoing method embodiments are implemented.

It should be noted that the computer-readable storage medium provided in the embodiment of the present application and the content recommendation method or the training method of the multitask model provided in the embodiment of the present application are based on the same inventive concept and have the same technical effect, so for the specific implementation of this embodiment, reference may be made to the implementation of the content recommendation method or the training method of the multitask model, and details are not described here again.

In a specific embodiment, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or an instruction to implement the steps described in the foregoing method embodiments.

It should be noted that the chip provided in the embodiment of the present application and the content recommendation method or the training method of the multitask model provided in the embodiment of the present application are based on the same inventive concept and have the same technical effect, so for the specific implementation of this embodiment, reference may be made to the implementation of the aforementioned content recommendation method or training method of the multitask model, and details are not described here again.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In a typical configuration, an electronic device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media) such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims

1. A training method of a multitask model is characterized by comprising the following steps:

acquiring a training sample, wherein the training sample at least comprises user behavior data, user characteristics and historical browsing data corresponding to the user behavior data;

determining interest vector representation of a target user according to the user behavior data, the historical browsing data and the user characteristics;

inputting the interest vector representation into a multi-task model to be trained for training until a target loss function of the multi-task model to be trained is converged to obtain a trained multi-task model;

the multitask model comprises a sharing layer and a plurality of subtask models, wherein the sharing layer enables content sharing among the subtask models, the training sample serves as the input of each subtask model, each subtask model outputs a predicted value for its own training target according to the user behavior data, the user characteristics and the historical browsing data corresponding to the user behavior data in the training sample, and the training target comprises at least one of click rate, interaction rate and viewing duration for candidate content to be viewed.

2. The method for training a multitask model according to claim 1, wherein the user behavior data includes a sequence of user target behaviors, a target behavior includes at least one of click behavior, interactive behavior and video watching behavior, different sequences of user target behaviors correspond to different historical browsing data, and the determining the interest vector representation of a target user according to the user behavior data, the historical browsing data and the user characteristics includes:

performing feature extraction on historical browsing data corresponding to the user target behavior sequence based on a self-attention mechanism model to obtain a plurality of interest representations of the target behavior sequence of the target user corresponding to the user features;

and performing weighted fusion on each interest representation to obtain the interest vector representation.

3. The method for training a multitask model according to claim 2, wherein the performing feature extraction on the historical browsing data corresponding to the user target behavior sequence based on the self-attention mechanism model to obtain the interest representations of the target behavior sequence of the target user corresponding to the user features comprises:

mapping the user characteristics, the user target behavior sequence and historical browsing data corresponding to the user target behavior sequence into a low-dimensional matrix;

determining the embedded representation of the target behavior sequence of the target user according to the low-dimensional matrix;

determining a new representation of the target behavior sequence of the target user using the low-dimensional matrix and the embedded representation based on the self-attention mechanism model;

calculating the degree of association between the new representation and the candidate content to be viewed;

and linearly weighting the embedded representation of the target behavior sequence of the target user according to the degree of association to obtain the interest representation.

4. The method for training a multitask model according to claim 1, wherein said plurality of subtask models includes a click-through rate subtask model, an interaction rate subtask model, and a viewing duration subtask model, wherein a cross entropy function is used for a loss function of said click-through rate subtask model and a loss function of said interaction rate subtask model, and a root mean square error function is used for a loss function of said viewing duration subtask model.

5. The method of claim 4, wherein the target loss function is a weighted sum of the loss function of the click-through rate subtask model, the loss function of the interaction rate subtask model, and the loss function of the viewing duration subtask model.

6. The method of claim 5, wherein the weights in the target loss function for the click-through rate subtask model, the interaction rate subtask model, and the viewing duration subtask model are determined based on an uncertainty weighted learning strategy, and the target loss function further comprises: the sum of the weighted sum and a logarithmic term of the parameters in the uncertainty weighted learning strategy.

7. A content recommendation method, characterized in that the content recommendation method comprises:

acquiring browsing data and user characteristics of a target user;

inputting the browsing data and the user characteristics into a multitask model for content prediction to obtain prediction information output by the multitask model and aiming at candidate content to be viewed, wherein the multitask model is obtained by training according to any one of claims 1 to 6;

and recommending candidate content corresponding to the prediction information for the target user according to the indication of the prediction information.

8. The content recommendation method according to claim 7, wherein the prediction information includes predicted values of click-through rate, interaction rate and viewing duration of candidate content, and the recommending the candidate content corresponding to the prediction information for the target user according to the indication of the prediction information comprises:

and determining the candidate content with the highest ranking score according to the predicted values of the click-through rate, the interaction rate and the viewing duration of the candidate content, and recommending the candidate content with the highest ranking score to the target user.

9. An apparatus for training a multitask learning model, comprising:

the acquisition module is used for acquiring a training sample, wherein the training sample at least comprises user behavior data, user characteristics and historical browsing data corresponding to the user behavior data;

the determining module is used for determining the interest vector representation of the target user according to the user behavior data, the historical browsing data and the user characteristics;

the training module is used for inputting the interest vector representation into a multi-task model to be trained for training until a target loss function of the multi-task model to be trained is converged to obtain a trained multi-task model;

10. A content recommendation apparatus characterized by comprising:

the acquisition module is used for acquiring browsing data and user characteristics of a target user;

the prediction module is used for inputting the browsing data and the user characteristics into a multitask model for content prediction to obtain prediction information output by the multitask model, wherein the multitask model is trained according to any one of claims 1 to 6;

and the recommending module is used for recommending the target content corresponding to the prediction information for the target user according to the indication of the prediction information.