CN113850654A - Training method of item recommendation model, item screening method, device and equipment - Google Patents

Training method of item recommendation model, item screening method, device and equipment

Info

Publication number
CN113850654A
CN113850654A (application CN202111246631.8A)
Authority
CN
China
Prior art keywords
user
vector
article
item
prediction task
Prior art date
Legal status
Pending
Application number
CN202111246631.8A
Other languages
Chinese (zh)
Inventor
汪加林
吴欢欢
丁卓冶
Current Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202111246631.8A
Publication of CN113850654A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The application provides a training method for an item recommendation model, an item screening method, an item screening apparatus and a device. The training method comprises the following steps: acquiring preset prediction tasks in a recommendation model; obtaining a user vector and an item vector for each prediction task according to the user features and item features input to the recommendation model; determining a loss function for each prediction task according to the user vector, the item vector and a preset sample label of each prediction task; obtaining a target loss function of the recommendation model according to the loss functions of the prediction tasks; and obtaining a trained recommendation model according to the target loss function, wherein the trained recommendation model is used for predicting the user's degree of preference for different items. With this technical solution, the trained model can model multiple targets while the time consumed by rough ranking is reduced and the rough-ranking efficiency of the model is improved.

Description

Training method of item recommendation model, item screening method, device and equipment
Technical Field
The application relates to the technical field of machine learning, in particular to a training method of an article recommendation model, an article screening method, an article screening device and article screening equipment.
Background
An online shopping platform makes shopping convenient and can recommend items a user may want to buy based on the user's past behavior records. Before items are finally recommended to the user, the online platform generally recalls items on the order of tens of thousands from an item pool on the order of billions, narrows them to thousands of items through rough ranking, narrows them further to hundreds of items through fine ranking, and finally selects the top few or few dozen items through rearrangement for recommendation to the user.
In the prior art, the rough ranking stage mainly adopts a deep learning model based on vector inner products, with the user click-through rate as the task target; by combining the user's past behavior records with item information, it filters the tens of thousands of recalled items down to thousands of items the user is likely to buy.
However, this prior-art method can only model a single task target; when multiple task targets are required, the time consumed by the rough-ranking process increases and the rough-ranking efficiency drops.
Disclosure of Invention
The application provides a training method for an item recommendation model, an item screening method, an item screening apparatus and a device, which are used to solve the problem that the existing rough-ranking process of the model is time-consuming.
In a first aspect, an embodiment of the present application provides a method for training an item recommendation model, including:
acquiring preset prediction tasks in a recommendation model, wherein the number of the prediction tasks is at least two;
obtaining a user vector and an article vector of each prediction task according to the user characteristics and the article characteristics input to the recommendation model, wherein the user characteristics are used for indicating interaction behaviors of the user and the article and user information, and the article characteristics are used for indicating attribute information of the article;
determining a loss function of each prediction task according to the user vector, the article vector and a preset sample label of each prediction task;
obtaining a target loss function of the recommendation model according to the loss function of each prediction task;
and acquiring a trained recommendation model according to the target loss function, wherein the trained recommendation model is used for predicting the preference degree of the user to different articles.
In a possible design of the first aspect, the obtaining a user vector and an item vector for each prediction task according to the user features and the item features input to the recommendation model includes:
acquiring a feature domain of the user feature and a feature domain of the article feature;
according to the feature domain of the user feature and the feature domain of the article feature, dividing the user feature to obtain a user symmetric feature and a user asymmetric feature, and dividing the article feature to obtain an article symmetric feature and an article asymmetric feature;
according to the user symmetric features and the user asymmetric features, obtaining a user symmetric vector and a user asymmetric vector of each prediction task as the user vectors;
and obtaining an item symmetric vector and an item asymmetric vector of each prediction task as item vectors according to the item symmetric features and the item asymmetric features.
In another possible design of the first aspect, the obtaining a user vector and an item vector for each prediction task according to the user characteristics and the item characteristics input to the recommendation model includes:
constructing a target number of multilayer perceptrons in the recommendation model according to the number of the prediction tasks, wherein the multilayer perceptrons are used for feature modeling;
determining a multi-layer perceptron corresponding to the user symmetric characteristic, the user asymmetric characteristic, the article symmetric characteristic and the article asymmetric characteristic of each prediction task respectively;
modeling the user symmetric feature and the user asymmetric feature of each prediction task with the corresponding multilayer perceptrons to obtain a user symmetric vector and a user asymmetric vector corresponding to each prediction task;
and respectively modeling the article symmetric feature and the article asymmetric feature of each prediction task by using a corresponding multilayer perceptron to obtain an article symmetric vector and an article asymmetric vector of each prediction task.
In yet another possible design of the first aspect, the determining a loss function for each prediction task according to the user vector, the item vector, and the preset sample label for each prediction task includes:
splicing the user symmetric vector and the user asymmetric vector of each prediction task to obtain a first user splicing vector of each prediction task;
splicing the article symmetric vector and the article asymmetric vector of each prediction task to obtain a first article splicing vector of each prediction task;
and determining a loss function corresponding to each prediction task according to the first user splicing vector and the first article splicing vector of each prediction task.
In another possible design of the first aspect, the stitching the user symmetric vector and the user asymmetric vector of each prediction task to obtain a first user stitching vector of each prediction task includes:
splicing the user symmetric vector and the user asymmetric vector of each prediction task to obtain a first spliced vector after splicing;
and adding a first preset dimension into the first splicing vector to obtain a first user splicing vector of each prediction task.
In another possible design of the first aspect, the stitching the item symmetric vector and the item asymmetric vector of each prediction task to obtain a first item stitching vector of each prediction task includes:
splicing the article symmetric vector and the article asymmetric vector of each prediction task to obtain a second spliced vector after splicing;
and adding a second preset dimension into the second splicing vector to obtain a first article splicing vector of each prediction task.
In yet another possible design of the first aspect, the determining a loss function corresponding to each prediction task according to the first user splicing vector and the first item splicing vector of each prediction task includes:
performing inner product on the first user splicing vector and the first article splicing vector of each prediction task to obtain a first inner product result;
calculating a predicted value of each predicted task according to a preset activation function and the first inner product result;
and determining a loss function of each prediction task according to a preset cross entropy function, the prediction value and a preset sample label.
In yet another possible design of the first aspect, the obtaining the target loss function of the recommendation model according to the loss function of each predicted task includes:
acquiring a weight corresponding to the loss function of each prediction task, wherein the weight is used for representing the importance degree of the loss function of each prediction task to the target loss function;
and multiplying the loss function of each prediction task by the corresponding weight respectively, and adding to obtain the target loss function.
In yet another possible design of the first aspect, the obtaining a trained recommendation model according to the objective loss function includes:
and updating the parameters of the recommendation model by utilizing back propagation according to the target loss function to obtain the trained recommendation model.
In yet another possible design of the first aspect, before the obtaining the user vector and the item vector for each prediction task according to the user characteristic and the item characteristic input to the recommendation model, the method further includes:
acquiring the user characteristics in a preset user behavior log;
and acquiring the article characteristics from a preset offline characteristic log.
In a second aspect, an embodiment of the present application provides an article screening method, including:
acquiring user characteristics of a target user and article characteristics of a recalled article, wherein the recalled article is recalled from a preset article pool;
inputting the user characteristics and the article characteristics into a recommendation model to obtain a ranking result of the preference degree of the target user for the recalled articles, wherein the recommendation model is obtained by utilizing the user characteristics and the article characteristics of at least two prediction tasks;
and screening the recalled articles to obtain target articles according to the sorting result.
In one possible design of the second aspect, the inputting the user characteristic and the item characteristic into a recommendation model to obtain a result of ranking the preference of the target user for the recalled item includes:
obtaining a target user vector and a target article vector of each prediction task in the recommendation model according to the user characteristics and the article characteristics;
splicing the target user vectors of each prediction task to obtain a second user splicing vector;
after splicing the target item vectors of each prediction task, multiplying the result by a preset coefficient to obtain a second item splicing vector, wherein the preset coefficient is used for representing the weight among the prediction tasks;
performing inner product on the second user splicing vector and the second article splicing vector to obtain a second inner product result;
and determining a sequencing result of the preference degree of the target user for the recalled item according to the second inner product result.
In another possible design of the second aspect, the obtaining a target user vector and a target item vector for each prediction task in the recommendation model according to the user characteristics and the item characteristics includes:
acquiring a user symmetric characteristic and a user asymmetric characteristic from the user characteristics;
acquiring an article symmetric feature and an article asymmetric feature from the article features;
obtaining a target user vector of each prediction task according to the user symmetric characteristic and the user asymmetric characteristic of each prediction task;
and obtaining a target item vector of each prediction task according to the item symmetric characteristic and the item asymmetric characteristic of each prediction task.
In yet another possible design of the second aspect, the obtaining of the user characteristic of the target user and the item characteristic of the recalled item includes:
acquiring an article portrait from a preset article portrait database, and determining the article features of the recalled article according to the article portrait;
and acquiring a user portrait from a preset user portrait database, and determining the user features of the target user according to the user portrait.
In a third aspect, an embodiment of the present application provides a training apparatus for an item recommendation model, including:
the task obtaining module is used for obtaining the prediction tasks in the recommendation model, and the number of the prediction tasks is at least two;
the vector acquisition module is used for acquiring a user vector and an article vector of each prediction task according to the user characteristics and the article characteristics input to the recommendation model, wherein the user characteristics are used for indicating interaction behaviors of a user and an article and user information, and the article characteristics are used for indicating attribute information of the article;
the function determining module is used for determining a loss function of each prediction task according to the user vector, the article vector and the preset sample label of each prediction task;
the function obtaining module is used for obtaining a target loss function of the recommendation model according to the loss function of each prediction task;
and the model training module is used for acquiring a trained recommendation model according to the target loss function, and the trained recommendation model is used for predicting the preference degree of the user on different articles.
In a fourth aspect, an embodiment of the present application provides an article screening apparatus, including:
the feature acquisition module is used for acquiring the user characteristics of a target user and the article characteristics of a recalled article, wherein the recalled article is recalled from a preset article pool;
the item sorting module is used for inputting the user characteristics and the item characteristics into a recommendation model to obtain a sorting result of the preference degree of the target user for the recalled items, and the recommendation model is obtained by utilizing the user characteristics and the item characteristics of at least two prediction tasks;
and the article screening module is used for screening the recalled articles to obtain target articles according to the sorting result.
In a fifth aspect, an embodiment of the present application provides a computer device, including: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the method described above.
In a sixth aspect, an embodiment of the present application provides a server, including: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the method described above.
In a seventh aspect, the present application provides a readable storage medium, in which computer instructions are stored, and when executed by a processor, the computer instructions are used to implement the method as described above.
In an eighth aspect, the present application provides a program product including computer instructions, which when executed by a processor implement the method as described above.
According to the training method for an item recommendation model, the item screening method, the apparatus and the device provided by the application, a plurality of prediction tasks are set in the model, feature modeling is performed for each prediction task, and the loss function corresponding to each prediction task is obtained from the modeled feature vectors. The target loss function of the model is then calculated from the loss functions of the prediction tasks, so that the trained model can model multiple targets while the time consumed by rough ranking is reduced and the rough-ranking efficiency of the model is improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application;
fig. 1 is a schematic structural diagram of a recommendation system provided in an embodiment of the present application;
fig. 2 is a schematic flowchart of a training method of an item recommendation model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a model structure of a recommended model provided in the present application;
fig. 4 is a schematic flow chart of an article screening method according to an embodiment of the present disclosure;
fig. 5 is a system framework diagram of an article screening system provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of a training apparatus for an item recommendation model according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an article screening apparatus according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a computer device provided in an embodiment of the present application;
fig. 9 is a schematic structural diagram of a server according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms referred to in this application are explained first:
MLP:
a Multilayer Perceptron (MLP), also known as an Artificial Neural Network (ANN), maps a set of input vectors to a set of output vectors. An MLP can be viewed as a directed graph, consisting of multiple levels of nodes, each level being fully connected to the next level. Each node, except the input nodes, is a neuron (or processing unit) with a nonlinear activation function.
Fig. 1 is a schematic structural diagram of a recommendation system according to an embodiment of the present disclosure. As shown in fig. 1, the items displayed on a shopping platform are usually recorded in an item pool, which may be on the order of billions. To screen some better-quality items out of this pool and push them to the user, the items need to pass through a series of modules such as a recall module, a rough ranking module, a fine ranking module and a rearrangement module. The purpose of the rough ranking module is mainly to further screen good-quality items from the recalled items and send them to the fine ranking module, which relieves the pressure on the fine ranking module while preserving the recommendation effect. The model in the rough ranking module should be as simple as possible, since the time allowed for rough ranking is limited; the goal is to reduce the time consumed by rough ranking while keeping the recommendation effect as far as possible. During rough ranking, targets such as click-through rate and conversion rate may be added to ensure that the items obtained by rough ranking have relatively high click-through and conversion rates. The click-through rate is the probability that the user will click on an item on the shopping platform, and the conversion rate is the probability that the user will purchase an item on the shopping platform.
In the prior art, the methods used in the rough ranking module of a recommendation system to sort recalled items mainly fall into three categories. The first and simplest is to sort the recalled items by a few features that reflect item quality; however, this approach treats all users the same and does not consider personalization. The second is to rank with a simple shallow model, such as a Logistic Regression (LR) model, whose personalization ability is limited. The third is to use a deep learning model based on vector inner products. However, when such a vector-inner-product deep learning model is used for rough ranking, it can only model a single target; modeling multiple targets increases the time consumed by the model. Moreover, the existing vector-inner-product deep learning models do not make full use of the item and user features, so the accuracy of the recommendation system suffers and the recommendation effect is poor.
To solve the above problems, embodiments of the present application provide a training method for an item recommendation model, an item screening method, an apparatus and a device. A user vector and an item vector are computed for each prediction task, the loss function of each prediction task is determined, a target loss function of the whole recommendation model is then obtained from the loss functions of the prediction tasks, and the recommendation model is trained accordingly. When the trained recommendation model computes a user's preference for different items, it only needs to calculate the vector inner product once rather than separately for each prediction task, which reduces the computation time of rough ranking and improves the rough-ranking efficiency.
The technical solution of the present application will be described in detail below with reference to specific examples. It should be noted that the following specific embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
Fig. 2 is a schematic flowchart of a training method for an item recommendation model according to an embodiment of the present application, where the method may be applied to a local computer device, and as shown in fig. 2, the training method may specifically include the following steps:
s201, obtaining a preset prediction task in the recommendation model.
Wherein the number of prediction tasks is at least two. Illustratively, the prediction tasks may be click-through rate tasks and conversion rate tasks.
In this embodiment, the recommendation model is applied to the rough ranking module, ranks the recalled articles, further screens the recalled articles to obtain articles with better quality, and then sends the articles to the fine ranking module.
S202, obtaining a user vector and an article vector of each prediction task according to the user characteristics and the article characteristics input into the recommendation model.
The user features are used for indicating the interaction behaviors between the user and items as well as user information, and the item features are used for indicating the attribute information of the items. For example, the interaction behaviors may be behaviors such as clicking, browsing, adding to the shopping cart and purchasing on the shopping platform; the user information may include the user's age, gender, occupation and the like; and the attribute information of an item may be information such as its category, brand and price.
In some embodiments, the user characteristics and the item characteristics may be collected in advance, for example, when the user uses the shopping platform, the generated user characteristics may be collected in a preset user behavior log, and the item characteristics may be stored in a preset offline characteristic log in advance, that is, the user characteristics may be obtained from the preset user behavior log, and the item characteristics may be obtained from the preset offline characteristic log.
In this embodiment, the user features may be modeled by a first MLP to obtain a user vector, and the item features may be modeled by a second MLP to obtain an item vector. Illustratively, the first MLP and the second MLP are not identical.
Illustratively, take prediction tasks comprising a user click-through rate (CTR) and a user click-through conversion rate (CTCVR) as an example; the user features and the item features of the different prediction tasks are modeled with different MLP networks respectively.
Taking the user feature x_u and the item feature x_i as an example, the user vector of the click-through rate is E_u^ctr, the item vector of the click-through rate is E_i^ctr, the user vector of the click-through conversion rate is E_u^ctcvr, and the item vector of the click-through conversion rate is E_i^ctcvr, where each vector is output by its own MLP, i.e. E_u^ctr = MLP(x_u), E_i^ctr = MLP(x_i), E_u^ctcvr = MLP(x_u) and E_i^ctcvr = MLP(x_i), with a separate MLP used for each vector.
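For illustration only, the per-task modeling described above can be sketched as follows, reusing the make_mlp helper from the earlier sketch; all feature dimensions and layer sizes are assumptions rather than values given by the application:

```python
import torch

# One MLP per (task, side): four towers for the CTR/CTCVR example above.
mlp_user_ctr   = make_mlp(in_dim=64, hidden_dim=128, out_dim=32)
mlp_item_ctr   = make_mlp(in_dim=48, hidden_dim=128, out_dim=32)
mlp_user_ctcvr = make_mlp(in_dim=64, hidden_dim=128, out_dim=32)
mlp_item_ctcvr = make_mlp(in_dim=48, hidden_dim=128, out_dim=32)

x_u = torch.randn(8, 64)   # a batch of user features
x_i = torch.randn(8, 48)   # a batch of item features

E_u_ctr,   E_i_ctr   = mlp_user_ctr(x_u),   mlp_item_ctr(x_i)    # CTR user/item vectors
E_u_ctcvr, E_i_ctcvr = mlp_user_ctcvr(x_u), mlp_item_ctcvr(x_i)  # CTCVR user/item vectors
```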
s203, determining a loss function of each prediction task according to the user vector, the item vector and the preset sample label of each prediction task.
In this embodiment, each prediction task has a corresponding preset sample label. For example, the preset sample label corresponding to the user click-through rate is label_1, and the preset sample label corresponding to the user click-through conversion rate is label_2.
For example, the loss function of each prediction task can be obtained by the following formulas:
ctr = sigmoid(<E_u^ctr, E_i^ctr>)
ctcvr = sigmoid(<E_u^ctcvr, E_i^ctcvr>)
loss_ctr = CE(ctr, label_1)
loss_ctcvr = CE(ctcvr, label_2)
In the above formulas, ctr denotes the predicted value of the user click-through rate, ctcvr denotes the predicted value of the user click-through conversion rate, CE() denotes the cross-entropy function, loss_ctr denotes the loss function of the user click-through rate, and loss_ctcvr denotes the loss function of the user click-through conversion rate.
And S204, obtaining a target loss function of the recommendation model according to the loss function of each prediction task.
For example, the target loss function can be calculated by the following formula:
Loss = α*loss_ctr + β*loss_ctcvr
In the above formula, Loss denotes the target loss function, and α and β are preset hyper-parameters that respectively represent the importance of the user click-through-rate prediction task and of the user click-through-conversion-rate prediction task to the target loss function. Illustratively, α and β may be preset or obtained by machine learning.
And S205, acquiring a trained recommendation model according to the target loss function.
For example, in some embodiments, parameters of the recommendation model may be updated by using back propagation according to the objective loss function, so as to obtain the trained recommendation model. Specifically, parameters of the recommendation model can be updated according to back propagation in deep learning, and the parameters are updated by continuously comparing the output value and the expected value of the target loss function until the optimized parameters are obtained, so that the training process of the recommendation model is completed.
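A minimal end-to-end training step is sketched below, continuing the earlier sketches; the labels, the weights alpha and beta, and the optimizer choice are illustrative assumptions, not values prescribed by the application:

```python
import torch
import torch.nn.functional as F

towers = (mlp_user_ctr, mlp_item_ctr, mlp_user_ctcvr, mlp_item_ctcvr)
opt = torch.optim.Adam([p for m in towers for p in m.parameters()], lr=1e-3)

alpha, beta = 1.0, 1.0                       # preset hyper-parameters weighting the two tasks
label_1 = torch.randint(0, 2, (8,)).float()  # CTR sample labels (placeholder data)
label_2 = torch.randint(0, 2, (8,)).float()  # CTCVR sample labels (placeholder data)

# Predicted values: sigmoid of the per-task user/item vector inner product.
ctr   = torch.sigmoid((E_u_ctr   * E_i_ctr).sum(dim=1))
ctcvr = torch.sigmoid((E_u_ctcvr * E_i_ctcvr).sum(dim=1))

# Per-task cross-entropy losses and the weighted target loss.
loss_ctr   = F.binary_cross_entropy(ctr, label_1)
loss_ctcvr = F.binary_cross_entropy(ctcvr, label_2)
loss = alpha * loss_ctr + beta * loss_ctcvr

opt.zero_grad()
loss.backward()   # back propagation
opt.step()        # update the recommendation model parameters
```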
In this embodiment of the application, a plurality of prediction tasks are set in the model, feature modeling is performed for each prediction task, and the loss function corresponding to each prediction task is obtained from the modeled feature vectors. The target loss function of the model is then calculated from the loss functions of the prediction tasks, so that the trained model can model multiple targets while the time consumed by rough ranking is reduced and the rough-ranking efficiency of the model is improved.
Further, in some embodiments, the user characteristic may be divided into a user symmetric characteristic and a user asymmetric characteristic, and the item characteristic may be divided into an item symmetric characteristic and an item asymmetric characteristic. The step S202 can be specifically realized by the following steps:
acquiring a feature domain of a user feature and a feature domain of an article feature;
according to the feature domain of the user feature and the feature domain of the article feature, dividing the user feature to obtain a user symmetric feature and a user asymmetric feature, and dividing the article feature to obtain an article symmetric feature and an article asymmetric feature;
according to the user symmetric features and the user asymmetric features, obtaining a user symmetric vector and a user asymmetric vector of each prediction task as user vectors;
and obtaining an item symmetric vector and an item asymmetric vector of each prediction task as item vectors according to the item symmetric features and the item asymmetric features.
In this embodiment, a feature domain is used to indicate a group of features; for example, "occupation" may serve as a feature domain, and its code may be 000100, indicating that the domain contains 6 features. A user symmetric feature is a user feature whose feature domain also exists among the item features, and an item symmetric feature is an item feature whose feature domain also exists among the user features. A user asymmetric feature is a user feature with no matching feature domain among the item features, and an item asymmetric feature is an item feature with no matching feature domain among the user features.
For example, the user's item-category preference and item-brand preference among the user symmetric features correspond to the item category and item brand among the item symmetric features. The user's gender among the user asymmetric features and the item's price among the item asymmetric features have no counterpart, since gender is a feature unique to the user and price is a feature unique to the item.
Illustratively, the user symmetric feature may be denoted x_{u,sym}, the item symmetric feature x_{i,sym}, the user asymmetric feature x_{u,asym}, and the item asymmetric feature x_{i,asym}.
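As a small illustration of the split (the feature-domain names and values are made up for this example, and corresponding domains are treated as sharing a key for simplicity):

```python
# Features keyed by feature domain; a domain is "symmetric" when it
# appears on both the user side and the item side.
user_features = {"category_preference": [0.2, 0.8], "brand_preference": [0.5], "gender": [1.0]}
item_features = {"category_preference": [0.0, 1.0], "brand_preference": [0.3], "price": [199.0]}

shared = user_features.keys() & item_features.keys()
x_u_sym  = {d: v for d, v in user_features.items() if d in shared}      # user symmetric features
x_u_asym = {d: v for d, v in user_features.items() if d not in shared}  # user asymmetric features
x_i_sym  = {d: v for d, v in item_features.items() if d in shared}      # item symmetric features
x_i_asym = {d: v for d, v in item_features.items() if d not in shared}  # item asymmetric features
```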
The user symmetric vector, the user asymmetric vector, the item symmetric vector and the item asymmetric vector of each prediction task can be calculated by the following formulas:
E_u^ctr = MLP(x_{u,sym}), B_u^ctr = MLP(x_{u,asym}), E_i^ctr = MLP(x_{i,sym}), B_i^ctr = MLP(x_{i,asym})
E_u^ctcvr = MLP(x_{u,sym}), B_u^ctcvr = MLP(x_{u,asym}), E_i^ctcvr = MLP(x_{i,sym}), B_i^ctcvr = MLP(x_{i,asym})
Wherein E_u^ctr denotes the user symmetric vector of the CTR task, E_u^ctcvr the user symmetric vector of the CTCVR task, E_i^ctr the item symmetric vector of the CTR task, E_i^ctcvr the item symmetric vector of the CTCVR task, B_u^ctr the user asymmetric vector of the CTR task, B_u^ctcvr the user asymmetric vector of the CTCVR task, B_i^ctr the item asymmetric vector of the CTR task, and B_i^ctcvr the item asymmetric vector of the CTCVR task; each MLP above is a separate multilayer perceptron.
In this embodiment of the application, the user features and the item features are distinguished into symmetric and asymmetric features. Compared with the traditional approach of modeling all features uniformly, in which the information contributed by the user's asymmetric features finds no counterpart in the information contributed by the item features, this avoids bringing noise into the model and improves the accuracy of the model.
For example, fig. 3 is a schematic diagram of a model structure of a recommendation model provided in the present application, taking the predicted tasks including the user click rate and the user click through conversion rate as an example, as shown in fig. 3, the recommendation model includes a plurality of multi-layer perceptron MLPs, and the number of the multi-layer perceptron MLPs is proportional to the number of the predicted tasks. Specifically, when the number of the prediction tasks is two, the user symmetric feature, the user asymmetric feature, the item symmetric feature and the item asymmetric feature of each prediction task are respectively and correspondingly input into one MLP.
The user symmetric features of each prediction task are modeled by an MLP that outputs a user symmetric vector 31, and the user asymmetric features of each prediction task are modeled by an MLP that outputs a user asymmetric vector 32. The item symmetric features of each prediction task are modeled by an MLP that outputs an item symmetric vector 33, and the item asymmetric features of each prediction task are modeled by an MLP that outputs an item asymmetric vector 34.
After the user symmetric vector 31 and the user asymmetric vector 32 of each prediction task are spliced, dimension 1 can be added, and finally, the first user splicing vector of each prediction task is obtained. After the item symmetric vector 33 and the item asymmetric vector 34 of each prediction task are spliced, dimension 1 can be added, and finally the first item splicing vector of each prediction task is obtained.
The first user stitching vector and the first item stitching vector are used for calculating the loss function of the user click-through rate CTR or of the user click-through conversion rate CTCVR, and SCORE represents the score of each recalled item when the trained recommendation model ranks the recalled items. A higher score indicates a higher user preference for the recalled item.
In this embodiment, the step S202 may be specifically implemented by the following steps:
constructing a target number of multilayer perceptrons in the recommendation model according to the number of prediction tasks;
acquiring a multi-layer perceptron corresponding to the user symmetric characteristic, the user asymmetric characteristic, the article symmetric characteristic and the article asymmetric characteristic of each prediction task respectively;
modeling the user symmetric feature and the user asymmetric feature of each prediction task with the corresponding multilayer perceptrons to obtain a user symmetric vector and a user asymmetric vector corresponding to each prediction task;
and respectively modeling the article symmetric feature and the article asymmetric feature of each prediction task by using a corresponding multilayer perceptron to obtain an article symmetric vector and an article asymmetric vector of each prediction task.
Wherein the multi-layer perceptron is used for feature modeling.
Specifically, the target number of multilayer perceptrons may be proportional to the number of prediction tasks. Illustratively, when the prediction tasks are the two tasks of user click-through rate and user click-through conversion rate, each feature group corresponds to two multilayer perceptron MLPs: the user symmetric features correspond to two MLPs, the user asymmetric features correspond to two MLPs, the item symmetric features correspond to two MLPs, and the item asymmetric features also correspond to two MLPs.
And each multi-layer perceptron MLP is used for modeling the characteristics of each prediction task to obtain a characteristic vector. Referring to fig. 3, with a target number of multi-layered perceptrons MLP, two user symmetric vectors, two user asymmetric vectors, two item symmetric vectors, and two item asymmetric vectors may be output.
In this embodiment of the application, a corresponding multilayer perceptron is set for the user symmetric features, the user asymmetric features, the item symmetric features and the item asymmetric features of each prediction task, so that multiple kinds of features can be modeled and a vector corresponding to each kind of feature is obtained. This realizes modeling of multiple targets (such as user click-through rate and user click-through conversion rate), makes full use of the user asymmetric features and item asymmetric features when estimating the user's preference for an item, and improves the accuracy of the model.
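For illustration, such a layout might be organized as below, reusing the make_mlp helper from the earlier sketch; the dimensions are placeholders and the dictionary grouping is an assumption of this sketch, not a structure defined by the application:

```python
# Two prediction tasks (ctr, ctcvr) x four feature groups = eight MLPs in total.
task_towers = {
    task: {
        "user_sym":  make_mlp(in_dim=32, hidden_dim=64, out_dim=16),
        "user_asym": make_mlp(in_dim=16, hidden_dim=64, out_dim=16),
        "item_sym":  make_mlp(in_dim=32, hidden_dim=64, out_dim=16),
        "item_asym": make_mlp(in_dim=16, hidden_dim=64, out_dim=16),
    }
    for task in ("ctr", "ctcvr")
}

def encode(task, xu_sym, xu_asym, xi_sym, xi_asym):
    """Return the user symmetric/asymmetric and item symmetric/asymmetric vectors of one task."""
    t = task_towers[task]
    return t["user_sym"](xu_sym), t["user_asym"](xu_asym), t["item_sym"](xi_sym), t["item_asym"](xi_asym)
```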
Further, on the basis of the above embodiments, in some embodiments, the step S203 may be implemented by:
splicing the user symmetric vector and the user asymmetric vector of each prediction task to obtain a first user splicing vector of each prediction task;
splicing the article symmetric vector and the article asymmetric vector of each prediction task to obtain a first article splicing vector of each prediction task;
and determining a loss function corresponding to each prediction task according to the first user splicing vector and the first article splicing vector of each prediction task.
In this embodiment, stitching refers to merging two vectors into one vector. In brief, taking the first vector as (1, 1, 1) and the second vector as (1, 0, 0) as an example, the two vectors are spliced to obtain (1, 1, 1, 1, 0, 0).
For example, take two prediction tasks, i.e. the prediction tasks comprise the user click-through rate and the user click-through conversion rate. For the user click-through rate, the user symmetric vector is E_u^ctr, the user asymmetric vector is B_u^ctr, the item symmetric vector is E_i^ctr, and the item asymmetric vector is B_i^ctr; for the user click-through conversion rate, the user symmetric vector is E_u^ctcvr, the user asymmetric vector is B_u^ctcvr, the item symmetric vector is E_i^ctcvr, and the item asymmetric vector is B_i^ctcvr. The first user splicing vector of the user click-through rate is then [E_u^ctr ; B_u^ctr ; 1] and the first item splicing vector is [E_i^ctr ; B_i^ctr ; 1]; the first user splicing vector of the user click-through conversion rate is [E_u^ctcvr ; B_u^ctcvr ; 1] and the first item splicing vector is [E_i^ctcvr ; B_i^ctcvr ; 1].
The estimated values of the user click-through rate and the user click-through conversion rate can be calculated by the following formulas:
ctr = sigmoid(<[E_u^ctr ; B_u^ctr ; 1], [E_i^ctr ; B_i^ctr ; 1]>)
ctcvr = sigmoid(<[E_u^ctcvr ; B_u^ctcvr ; 1], [E_i^ctcvr ; B_i^ctcvr ; 1]>)
In the above formulas, ctr denotes the estimated value of the user click-through rate, ctcvr denotes the estimated value of the user click-through conversion rate, and sigmoid is an activation function used for the hidden-layer neuron output; its value range is (0, 1), i.e. it maps a real number to the interval (0, 1).
The loss functions of the user click-through rate and of the user click-through conversion rate can be calculated by the following formulas:
loss_ctr = CE(ctr, label_1)
loss_ctcvr = CE(ctcvr, label_2)
In the above formulas, loss_ctr denotes the loss function of the user click-through rate, loss_ctcvr denotes the loss function of the user click-through conversion rate, CE() denotes the cross-entropy function, label_1 is the preset sample label corresponding to the user click-through rate, and label_2 is the preset sample label corresponding to the user click-through conversion rate.
In this application, by introducing and splicing the asymmetric and symmetric vectors, the user's asymmetric vector and the item's asymmetric vector can be fully utilized as bias terms when estimating the user's preference for an item, which improves the accuracy of the model.
Further, on the basis of the foregoing embodiments, in some embodiments, the step "splicing the user symmetric vector and the user asymmetric vector of each prediction task to obtain the first user splicing vector of each prediction task" may be specifically implemented by the following steps:
splicing the user symmetric vector and the user asymmetric vector of each prediction task to obtain a first spliced vector after splicing;
and adding a first preset dimension into the first splicing vector to obtain a first user splicing vector of each prediction task.
Illustratively, the first preset dimension may be a dimension with value 1. If the first spliced vector obtained by splicing the user symmetric vector and the user asymmetric vector of the user click-through rate is [E_u^ctr ; B_u^ctr], the first user splicing vector takes the value [E_u^ctr ; B_u^ctr ; 1]. By analogy, the first user splicing vector of the user click-through conversion rate takes the value [E_u^ctcvr ; B_u^ctcvr ; 1].
Further, on the basis of the foregoing embodiments, in some embodiments, the step "splicing the article symmetric vector and the article asymmetric vector of each prediction task to obtain the first article splicing vector of each prediction task" may be specifically implemented by the following steps:
splicing the article symmetric vector and the article asymmetric vector of each prediction task to obtain a second spliced vector after splicing;
and adding a second preset dimension into the second splicing vector to obtain the first article splicing vector of each prediction task.
Illustratively, the second preset dimension may also be a dimension with value 1. If the second spliced vector obtained by splicing the item symmetric vector and the item asymmetric vector of the user click-through rate is [E_i^ctr ; B_i^ctr], the first item splicing vector of the user click-through rate may take the value [E_i^ctr ; B_i^ctr ; 1]. By analogy, the first item splicing vector of the user click-through conversion rate may take the value [E_i^ctcvr ; B_i^ctcvr ; 1].
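The splicing-plus-appended-dimension step might be sketched as follows, continuing the earlier sketches; the batch of feature tensors is placeholder data:

```python
import torch

def splice_with_bias_dim(sym_vec: torch.Tensor, asym_vec: torch.Tensor) -> torch.Tensor:
    # Concatenate the symmetric and asymmetric vectors, then append a preset dimension of value 1.
    ones = torch.ones(sym_vec.size(0), 1)
    return torch.cat([sym_vec, asym_vec, ones], dim=1)

xu_sym, xu_asym = torch.randn(8, 32), torch.randn(8, 16)   # placeholder user feature tensors
xi_sym, xi_asym = torch.randn(8, 32), torch.randn(8, 16)   # placeholder item feature tensors

E_u_ctr, B_u_ctr, E_i_ctr, B_i_ctr = encode("ctr", xu_sym, xu_asym, xi_sym, xi_asym)

user_splice_ctr = splice_with_bias_dim(E_u_ctr, B_u_ctr)   # first user splicing vector (CTR)
item_splice_ctr = splice_with_bias_dim(E_i_ctr, B_i_ctr)   # first item splicing vector (CTR)
```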
Further, on the basis of the foregoing embodiments, in some embodiments, the step "determining the loss function corresponding to each prediction task according to the first user splicing vector and the first article splicing vector of each prediction task" may specifically be implemented by the following steps:
performing inner product on the first user splicing vector and the first article splicing vector of each prediction task to obtain a first inner product result;
calculating a predicted value of each prediction task according to a preset activation function and the first inner product result;
and determining a loss function of each prediction task according to the preset cross entropy function, the predicted value and the preset sample label.
Specifically, taking prediction tasks comprising the user click-through rate and the user click-through conversion rate as an example, the loss function of each prediction task can be calculated by the following formulas:
ctr = sigmoid(<[E_u^ctr ; B_u^ctr ; 1], [E_i^ctr ; B_i^ctr ; 1]>)
ctcvr = sigmoid(<[E_u^ctcvr ; B_u^ctcvr ; 1], [E_i^ctcvr ; B_i^ctcvr ; 1]>)
loss_ctr = CE(ctr, label_1)
loss_ctcvr = CE(ctcvr, label_2)
On the basis of the foregoing embodiments, in some embodiments, the step S204 may be specifically implemented by the following steps:
acquiring the weight corresponding to the loss function of each prediction task;
and multiplying the loss function of each prediction task by the corresponding weight respectively, and adding to obtain a target loss function.
Wherein the weight is used for representing the importance degree of the loss function of each prediction task to the target loss function.
Specifically, each time there is a prediction task, there is a weight corresponding to it.
In this embodiment, the target loss function can be calculated by the following formula:
Loss = α*loss_ctr + β*loss_ctcvr
In the above formula, Loss denotes the target loss function, α is the weight corresponding to the loss function of the user click-through rate, and β is the weight corresponding to the loss function of the user click-through conversion rate; α and β respectively represent the importance of the click-through-rate loss function and of the click-through-conversion-rate loss function to the target loss function.
In this embodiment of the application, an extra dimension is added during splicing, the loss function of each prediction task is calculated, and the loss functions of the prediction tasks are then summed to obtain the target loss function of the model, so that multi-target modeling can be achieved without changing the computational complexity of the model and the user's intent can be characterized along multiple dimensions. Compared with the traditional multi-task prediction approach, obtaining the target loss function by summing the loss functions of all prediction tasks means that the user's preference can be predicted with a single calculation, instead of computing the prediction result of each target separately and then deriving the user's preference from those results; this reduces the amount of computation and improves the efficiency of the model while still modeling multiple targets.
Fig. 4 is a schematic flow chart of an article screening method according to an embodiment of the present application, where the article screening method may use the trained recommendation model to implement the technical solution. As shown in fig. 4, the article screening method may include the steps of:
s401, obtaining user characteristics of the target user and article characteristics of the recalled articles.
Wherein the recalled items are recalled from a preset item pool. Illustratively, the preset item pool contains all the items sold on the whole shopping platform, which can be on the order of billions. Recalling refers to retrieving some items that may be of interest to the user from the full item pool according to certain rules or models.
In this embodiment, the target users may be any users of the shopping platform, and each user has different user features. For example, some users are interested in mobile phones, and their user features may include interaction behaviors with the mobile phones sold on the shopping platform, such as clicking on a mobile phone of a certain brand or purchasing a mobile phone of a certain brand. Other users may be interested in computers, and their user features may include interaction behaviors with the computers sold on the shopping platform.
S402, inputting the user characteristics and the article characteristics into a recommendation model to obtain a sequencing result of the preference degree of the target user for the recalled articles.
The recommendation model is obtained by training with the user features and item features of at least two prediction tasks, and is used for predicting the user's degree of preference for different items. For example, taking mobile phones as the items, a brand the user prefers will rank near the top of the ranking result, while a brand the user dislikes will rank near the bottom.
And S403, screening the recalled articles to obtain target articles according to the sorting result.
For example, the ranking result may cover tens of thousands of items, from which items on the order of thousands are selected as target items. The target items are those the target user may prefer; they are then fed to the fine ranking module for a second round of screening and filtering, in which hundreds of items are kept out of the thousands of target items, and the remaining items are finally pushed to the user after being rearranged by the rearrangement module.
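As a toy illustration of this screening step (the item identifiers, scores and cutoff below are made up):

```python
import torch

def screen_top_items(item_ids: list[str], scores: torch.Tensor, keep: int) -> list[str]:
    # Keep the `keep` highest-scoring recalled items as target items for the fine ranking stage.
    top = torch.topk(scores, k=min(keep, scores.numel())).indices.tolist()
    return [item_ids[i] for i in top]

recalled_ids = [f"item_{n}" for n in range(10000)]
scores = torch.randn(10000)              # preference scores from the recommendation model
target_items = screen_top_items(recalled_ids, scores, keep=1000)
```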
In this application, target items are screened from the recalled items using a recommendation model trained with the user features and item features of multiple prediction tasks, which reduces the computational complexity and time consumption of the recommendation model and improves prediction efficiency while still modeling multiple targets.
In some embodiments, the step S402 may be specifically implemented by the following steps:
obtaining a target user vector and a target article vector of each prediction task in the recommendation model according to the user characteristics and the article characteristics;
splicing the target user vectors of each prediction task to obtain a second user splicing vector;
after splicing the target article vectors of each prediction task, multiplying the target article vectors by a preset coefficient to obtain a second article splicing vector;
performing inner product on the second user splicing vector and the second article splicing vector to obtain a second inner product result;
and determining a sequencing result of the preference degree of the target user for the recalled item according to the second inner product result.
The preset coefficients are used for representing the weight among all the prediction tasks.
In this embodiment, the preference degree of the target user for each recalled item may be determined by calculating the score of each item, and the ranking result may be obtained by ranking according to the preference degree.
Specifically, taking the user click-through rate and the user click-through conversion rate as examples, the target user vector of the user click-through rate and the target user vector of the user click-through conversion rate may be spliced to obtain the second user splicing vector; and the target item vector of the user click-through rate and the target item vector of the user click-through conversion rate may be spliced and then multiplied by the preset coefficient to obtain the second item splicing vector.
According to the embodiment of the application, the inner product of the second user splicing vector and the second article splicing vector is carried out, so that the calculation time consumption is reduced and the working efficiency of the recommendation model is improved under the condition that the complexity of the recommendation model is not changed.
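A minimal sketch of this serving-time scoring, assuming the per-task target vectors are already computed as batched tensors; the dimensions and the value of the preset coefficient gamma are placeholders:

```python
import torch

def score_items(user_vecs: list[torch.Tensor], item_vecs: list[torch.Tensor], gamma: torch.Tensor) -> torch.Tensor:
    # Second user splicing vector: concatenate the target user vectors of all tasks.
    u = torch.cat(user_vecs, dim=1)
    # Second item splicing vector: concatenate the target item vectors and weight by gamma.
    v = torch.cat(item_vecs, dim=1) * gamma
    # One inner product per (user, item) pair gives the preference score.
    return (u * v).sum(dim=1)

gamma = torch.ones(1)                                      # preset coefficient weighting the tasks (placeholder)
u_ctr, u_ctcvr = torch.randn(5, 17), torch.randn(5, 17)    # target user vectors per task
i_ctr, i_ctcvr = torch.randn(5, 17), torch.randn(5, 17)    # target item vectors per task
scores = score_items([u_ctr, u_ctcvr], [i_ctr, i_ctcvr], gamma)
ranking = torch.argsort(scores, descending=True)           # ranking of recalled items by preference
```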
Further, in some embodiments, the step of "obtaining the target user vector and the target item vector of each prediction task in the recommendation model according to the user characteristics and the item characteristics" may be specifically implemented by the following steps:
acquiring user symmetric characteristics and user asymmetric characteristics from the user characteristics;
acquiring an article symmetric characteristic and an article asymmetric characteristic from the article characteristic;
obtaining a target user vector of each prediction task according to the user symmetric characteristic and the user asymmetric characteristic of each prediction task;
and obtaining a target item vector of each prediction task according to the item symmetric characteristic and the item asymmetric characteristic of each prediction task.
Specifically, the user features can be divided into user symmetric features and user asymmetric features, and the item features can be divided into item symmetric features and item asymmetric features. For the user click-through rate, the user symmetric vector corresponding to the user symmetric features is E_u^ctr, the user asymmetric vector is B_u^ctr, the item symmetric vector is E_i^ctr, and the item asymmetric vector is B_i^ctr; for the user click-through conversion rate, the user symmetric vector is E_u^ctcvr, the user asymmetric vector is B_u^ctcvr, the item symmetric vector is E_i^ctcvr, and the item asymmetric vector is B_i^ctcvr.
The score for each item can be calculated by the following formula:
Figure BDA0003321057370000189
in the above formula, score represents the score of each item,
Figure BDA00033210573700001810
a target user vector representing a user's click rate,
Figure BDA00033210573700001811
a target item vector representing a user's click rate,
Figure BDA00033210573700001812
and the target user vector of the user click rate is spliced with the target user vector of the conversion rate clicked by the user to obtain a second user splicing vector.
Figure BDA00033210573700001813
A target item vector representing a user's click rate,
Figure BDA00033210573700001814
a target item vector representing the user's click through conversion rate,
Figure BDA00033210573700001815
and a second item splicing vector obtained by splicing the target item vector of the user click rate and the target item vector of the user click through conversion rate is represented, and gamma represents a preset coefficient.
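For illustration only, the scoring step above could be sketched in Python as follows; the tensor shapes, the use of PyTorch, and the scalar form of the preset coefficient gamma are assumptions made for the sketch rather than details from this application:

```python
import torch

def coarse_rank(u_ctr, u_cvr, v_ctr, v_cvr, gamma=1.0):
    """Rank recalled articles with one inner product per article.

    u_ctr, u_cvr: target user vectors of the two tasks, each of shape (d,).
    v_ctr, v_cvr: target article vectors of the two tasks, each of shape (num_items, d).
    gamma: preset coefficient weighting the prediction tasks (assumed scalar here).
    """
    # Second user splicing vector: splice the per-task target user vectors.
    user_concat = torch.cat([u_ctr, u_cvr], dim=-1)            # shape (2d,)
    # Second article splicing vector: splice per-task target article vectors, then scale.
    item_concat = gamma * torch.cat([v_ctr, v_cvr], dim=-1)    # shape (num_items, 2d)
    # Second inner product result: one dot product per recalled article.
    scores = item_concat @ user_concat                          # shape (num_items,)
    # Sequencing result: article indices sorted by descending preference degree.
    return scores, torch.argsort(scores, descending=True)
```

Because the per-task vectors are spliced once and scored with a single inner product, the cost per recalled article stays at one dot product regardless of the number of prediction tasks, which is the efficiency property this embodiment relies on.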
For example, fig. 5 is a system framework diagram of an article screening system according to an embodiment of the present application. As shown in fig. 5, the preset article image database 51 of the article screening system 50 stores article images, that is, data related to an article such as its category, brand, and price. The preset user profile database 52 stores the user's historical behavior and user information, such as the user's age, gender, interest category, brand preference, and purchasing power.
Model training can obtain training data from a user behavior log 53 and an offline feature log 54. The preset user behavior log 53 records user behaviors on a shopping platform, such as browsing, clicking, adding to cart, and purchasing. The preset offline feature log 54 is a feature log produced by a feature calculation service when the user accesses the shopping platform, and is used to calculate the ranking functions of the rough ranking module 55, the fine ranking module 56, and the like according to the article image and the user portrait.
The rough ranking module 55 is configured to efficiently rank the articles recalled by the recall module 57, further screen out articles that the user may be interested in, and reduce the calculation pressure of the fine ranking module 56. The fine ranking module 56 uses a more complex model and richer features to rank the results of the rough ranking.
For example, an article image may be obtained in the preset article image database 51, and article features of the recalled article may be determined according to the article image; a user profile is obtained in a preset user profile database 52, and user characteristics of a target user are determined based on the user profile.
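For illustration, the overall screening flow can be sketched as follows; recall_items, get_user_portrait, get_article_image, model.rank, and the top_k cutoff are hypothetical placeholders for the recall service, the portrait databases, and the recommendation model interface, not interfaces defined by this application:

```python
def screen_items(user_id, model, recall_items, get_user_portrait, get_article_image, top_k=500):
    """Coarse-ranking stage: keep only the recalled articles the target user most likely prefers."""
    recalled = recall_items(user_id)                           # articles recalled from the preset article pool
    user_features = get_user_portrait(user_id)                 # from the preset user portrait database 52
    item_features = [get_article_image(a) for a in recalled]   # from the preset article image database 51
    ranking = model.rank(user_features, item_features)         # ranking result of preference degree
    return [recalled[i] for i in ranking[:top_k]]              # target articles passed to the fine ranking module
```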
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Fig. 6 is a schematic structural diagram of a training apparatus for an item recommendation model according to an embodiment of the present application, where the apparatus may be integrated on a computer device, or may be independent of the computer device and cooperate with the computer device to complete the technology of the present application. As shown in fig. 6, the training device 60 may specifically include a task obtaining module 61, a vector obtaining module 62, a function determining module 63, a function obtaining module 64, and a model training module 65.
The task obtaining module 61 is configured to obtain a prediction task in the recommendation model. The vector obtaining module 62 is configured to obtain a user vector and an item vector for each prediction task according to the user features and the item features input to the recommendation model. The function determining module 63 is configured to determine a loss function of each prediction task according to the user vector, the item vector, and the preset sample label of each prediction task. The function obtaining module 64 is configured to obtain a target loss function of the recommendation model according to the loss function of each prediction task. The model training module 65 is configured to obtain a trained recommendation model according to the target loss function.
The number of prediction tasks is at least two. The user characteristics are used for indicating the user's interaction behavior with articles and the user information, the article characteristics are used for indicating the attribute information of the articles, and the trained recommendation model is used for predicting the user's preference degree for different articles.
In some embodiments, the vector obtaining module may be specifically configured to:
acquiring a feature domain of a user feature and a feature domain of an article feature;
according to the feature domain of the user feature and the feature domain of the article feature, dividing the user feature to obtain a user symmetric feature and a user asymmetric feature, and dividing the article feature to obtain an article symmetric feature and an article asymmetric feature;
according to the user symmetric features and the user asymmetric features, obtaining a user symmetric vector and a user asymmetric vector of each prediction task as user vectors;
and obtaining an item symmetric vector and an item asymmetric vector of each prediction task as item vectors according to the item symmetric features and the item asymmetric features.
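For illustration only, the feature-domain division described above might be sketched as follows; the concrete domain names and their assignment to the symmetric groups are assumptions chosen to make the example concrete:

```python
# Hypothetical feature-domain assignment; which domains count as symmetric is not
# specified here and is chosen only for the sketch.
USER_SYMMETRIC_DOMAINS = {"age", "gender", "purchasing_power"}
ARTICLE_SYMMETRIC_DOMAINS = {"category", "brand", "price"}

def split_by_feature_domain(features, symmetric_domains):
    """Divide a {feature_domain: value} dict into symmetric and asymmetric parts."""
    symmetric = {k: v for k, v in features.items() if k in symmetric_domains}
    asymmetric = {k: v for k, v in features.items() if k not in symmetric_domains}
    return symmetric, asymmetric

# e.g. user_sym, user_asym = split_by_feature_domain(user_features, USER_SYMMETRIC_DOMAINS)
```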
Optionally, in some embodiments, the vector obtaining module may be configured to:
according to the number of the predicted tasks, constructing multilayer perceptrons with the target number in a recommendation model, wherein the multilayer perceptrons are used for carrying out feature modeling;
determining a multi-layer perceptron corresponding to the user symmetric characteristic, the user asymmetric characteristic, the article symmetric characteristic and the article asymmetric characteristic of each prediction task respectively;
modeling the user symmetric characteristic and the user asymmetric characteristic of each prediction task by using a corresponding multilayer perceptron to obtain a user symmetric vector and a user asymmetric vector corresponding to each prediction task;
and respectively modeling the article symmetric feature and the article asymmetric feature of each prediction task by using a corresponding multilayer perceptron to obtain an article symmetric vector and an article asymmetric vector of each prediction task.
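For illustration only, the per-task multilayer perceptrons might be organized as in the following PyTorch sketch; the assumption of two prediction tasks, dense pre-embedded inputs, and the hidden and output sizes are illustrative choices, not details from this application:

```python
from torch import nn

def make_mlp(in_dim, out_dim, hidden=128):
    # One multilayer perceptron used to model a single feature group of one task.
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))

class MultiTaskTowers(nn.Module):
    """Target number of multilayer perceptrons: four feature groups per prediction task."""

    def __init__(self, user_sym_dim, user_asym_dim, item_sym_dim, item_asym_dim,
                 num_tasks=2, vec_dim=64):
        super().__init__()
        self.user_sym = nn.ModuleList(make_mlp(user_sym_dim, vec_dim) for _ in range(num_tasks))
        self.user_asym = nn.ModuleList(make_mlp(user_asym_dim, vec_dim) for _ in range(num_tasks))
        self.item_sym = nn.ModuleList(make_mlp(item_sym_dim, vec_dim) for _ in range(num_tasks))
        self.item_asym = nn.ModuleList(make_mlp(item_asym_dim, vec_dim) for _ in range(num_tasks))

    def forward(self, u_sym, u_asym, v_sym, v_asym, task):
        # Model each feature group of the given prediction task with its own perceptron.
        return (self.user_sym[task](u_sym), self.user_asym[task](u_asym),
                self.item_sym[task](v_sym), self.item_asym[task](v_asym))
```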
In some embodiments, the function determination module may be configured to:
splicing the user symmetric vector and the user asymmetric vector of each prediction task to obtain a first user splicing vector of each prediction task;
splicing the article symmetric vector and the article asymmetric vector of each prediction task to obtain a first article splicing vector of each prediction task;
and determining a loss function corresponding to each prediction task according to the first user splicing vector and the first article splicing vector of each prediction task.
Optionally, in some embodiments, the function determining module may be specifically configured to:
splicing the user symmetric vector and the user asymmetric vector of each prediction task to obtain a first spliced vector after splicing;
and adding a first preset dimension into the first splicing vector to obtain a first user splicing vector of each prediction task.
Optionally, in some embodiments, the function determining module may be specifically configured to:
splicing the article symmetric vector and the article asymmetric vector of each prediction task to obtain a second spliced vector after splicing;
and adding a second preset dimension into the second splicing vector to obtain the first article splicing vector of each prediction task.
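One common way to realize such preset dimensions, shown here purely as an assumed sketch, is to append a constant dimension on the user side and a learned per-article bias dimension on the article side so that the later inner product carries an additive bias term:

```python
import torch

def add_preset_dimensions(user_concat, item_concat, item_bias):
    """Append the first preset dimension to the user vector and the second to the article vector.

    Assumption: user_concat and item_concat have shape (batch, d), item_bias has shape
    (batch, 1); the user side gains a constant 1 and the article side a learned bias.
    """
    ones = torch.ones(user_concat.shape[0], 1)                        # first preset dimension
    first_user_splice = torch.cat([user_concat, ones], dim=-1)
    first_item_splice = torch.cat([item_concat, item_bias], dim=-1)   # second preset dimension
    return first_user_splice, first_item_splice
```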
Optionally, in some embodiments, the function determining module may be specifically configured to:
performing inner product on the first user splicing vector and the first article splicing vector of each prediction task to obtain a first inner product result;
calculating a predicted value of each prediction task according to a preset activation function and the first inner product result;
and determining a loss function of each prediction task according to the preset cross entropy function, the predicted value and the preset sample label.
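For illustration only, the per-task loss described above could be sketched as follows; taking a sigmoid as the preset activation function and binary cross entropy as the preset cross-entropy function are assumptions consistent with click and conversion labels:

```python
import torch
import torch.nn.functional as F

def task_loss(user_vec, item_vec, labels):
    """Loss of one prediction task from its first user/item splicing vectors.

    user_vec, item_vec: shape (batch, d); labels: shape (batch,) with values in {0, 1}.
    """
    inner = (user_vec * item_vec).sum(dim=-1)             # first inner product result
    pred = torch.sigmoid(inner)                           # predicted value via the activation function
    return F.binary_cross_entropy(pred, labels.float())   # loss against the preset sample labels
```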
In some embodiments, the function obtaining module may be specifically configured to:
acquiring the weight corresponding to the loss function of each prediction task;
and multiplying the loss function of each prediction task by the corresponding weight respectively, and adding to obtain a target loss function.
Wherein the weight is used for representing the importance degree of the loss function of each prediction task to the target loss function.
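For illustration, the weighted combination might look like the following sketch; the weight values in the usage comment are illustrative:

```python
def target_loss(task_losses, weights):
    """Combine per-task losses into the target loss of the recommendation model."""
    # Each weight reflects how much its task's loss contributes to the target loss.
    return sum(w * l for w, l in zip(weights, task_losses))

# e.g. target = target_loss([ctr_loss, cvr_loss], weights=[1.0, 0.5])
```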
In some embodiments, the model training module may be specifically configured to:
and updating parameters of the recommendation model by utilizing back propagation according to the target loss function to obtain the trained recommendation model.
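For illustration only, a parameter update driven by the target loss might be sketched as follows; the choice of the Adam optimizer and the learning rate are assumptions:

```python
import torch

def train_step(model, optimizer, batch, compute_target_loss):
    """One parameter update of the recommendation model via back propagation."""
    optimizer.zero_grad()
    loss = compute_target_loss(model, batch)   # target loss combined over all prediction tasks
    loss.backward()                            # back propagation of the target loss
    optimizer.step()                           # update the recommendation model parameters
    return loss.item()

# e.g. optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```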
In some embodiments, the training device of the item recommendation model may further include a feature obtaining module, configured to obtain the user feature in a preset user behavior log, and obtain the item feature in a preset offline feature log.
The device provided by the embodiment of the application can be used for executing the training method of the item recommendation model in the embodiment, the implementation principle and the technical effect are similar, and details are not repeated here.
Fig. 7 is a schematic structural diagram of an article screening apparatus according to an embodiment of the present application, where the article screening apparatus may be integrated on a server, or may be independent of the server and cooperate with the server to complete the technical solution of the present application. As shown in fig. 7, the item sorting apparatus 70 includes a feature acquisition module 71, an item sorting module 72, and an item sorting module 73.
The feature acquiring module 71 is configured to acquire a user feature of the target user and an item feature of the recalled item. The item ranking module 72 is configured to input the user characteristics and the item characteristics into the recommendation model, and obtain a ranking result of the preference of the target user for the recalled items. The article screening module 73 is configured to screen the recalled articles to obtain target articles according to the sorting result.
The recalled articles are recalled from a preset article pool, and the recommendation model is obtained by training by using user characteristics and article characteristics of at least two prediction tasks.
For example, in some embodiments, the article sorting module may be specifically configured to:
obtaining a target user vector and a target article vector of each prediction task in the recommendation model according to the user characteristics and the article characteristics;
splicing the target user vectors of each prediction task to obtain a second user splicing vector;
after splicing the target article vectors of each prediction task, multiplying the target article vectors by a preset coefficient to obtain a second article splicing vector;
performing inner product on the second user splicing vector and the second article splicing vector to obtain a second inner product result;
and determining a sequencing result of the preference degree of the target user for the recalled item according to the second inner product result.
The preset coefficients are used for representing the weight among all the prediction tasks.
Optionally, in some embodiments, the article sorting module may be specifically configured to:
acquiring user symmetric characteristics and user asymmetric characteristics from the user characteristics;
acquiring an article symmetric characteristic and an article asymmetric characteristic from the article characteristic;
obtaining a target user vector of each prediction task according to the user symmetric characteristic and the user asymmetric characteristic of each prediction task;
and obtaining a target item vector of each prediction task according to the item symmetric characteristic and the item asymmetric characteristic of each prediction task.
For example, in some embodiments, the feature obtaining module 71 may be specifically configured to:
acquiring an article image from a preset article image database, and determining the article characteristics of a recalled article according to the article image;
and acquiring the user portrait from a preset user portrait database, and determining the user characteristics of the target user according to the user portrait.
The device provided by the embodiment of the application can be used for executing the article screening method in the embodiment, the implementation principle and the technical effect are similar, and the details are not repeated herein.
It should be noted that the division of the modules of the above apparatus is only a logical division; in actual implementation, the modules may be wholly or partially integrated into one physical entity or may be physically separated. These modules may all be implemented in the form of software invoked by a processing element, may all be implemented in hardware, or some may be implemented as software invoked by a processing element and the rest in hardware. For example, the article sorting module may be a separately arranged processing element, or may be stored in the memory of the apparatus in the form of program code that a certain processing element of the apparatus calls to execute the functions of the article sorting module. The other modules are implemented similarly. In addition, all or part of the modules may be integrated together or implemented independently.
For example, when some of the above modules are implemented in the form of program code invoked by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU), or another processor capable of calling program code.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the present application are all or partially generated when the computer program instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 8, the computer device 80 includes: at least one processor 81, memory 82, bus 83, and communication interface 84.
Wherein: the processor 81, the communication interface 84 and the memory 82 communicate with each other via a bus 83.
The communication interface 84 is used for communication with other devices. The communication interface comprises a communication interface for data transmission, a display interface or an operation interface for man-machine interaction and the like.
The processor is configured to execute the computer-executable instructions stored in the memory 82, and may specifically perform the relevant steps in the training method of the item recommendation model described in the above embodiments.
Fig. 9 is a schematic structural diagram of a server according to an embodiment of the present application, and as shown in fig. 9, the server 90 includes: at least one processor 91, memory 92, bus 93, and communication interface 94.
Wherein: the processor 91, the communication interface 94 and the memory 92 communicate with each other via a bus 93.
The communication interface 94 is used for communication with other devices. The communication interface includes a communication interface for data transmission.
The processor may be a central processing unit. The one or more processors included in the computer device and the server can be the same type of processor, such as one or more CPUs; or may be different types of processors such as one or more CPUs and one or more ASICs.
And the memory is used for storing computer execution instructions. The memory may comprise high speed RAM memory and may also include non-volatile memory, such as at least one disk memory.
The present embodiment also provides a readable storage medium in which computer instructions are stored. When at least one processor of a computer device executes the computer instructions, the computer device executes the training method of the item recommendation model provided by the above embodiments; when at least one processor of a server executes the computer instructions, the server executes the item screening method provided by the above embodiments.
The present embodiments also provide a program product comprising computer instructions stored in a readable storage medium. The computer instructions may be read from a readable storage medium by at least one processor of a computer device, and execution of the computer instructions by the at least one processor causes the computer device to implement the method for training an item recommendation model provided by the various embodiments described above. The computer instructions may be read from a readable storage medium by at least one processor of the server, and execution of the computer instructions by the at least one processor causes the server to implement the item screening method provided by the various embodiments described above.
In the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship; in the formula, the character "/" indicates that the preceding and following related objects are in a relationship of "division". "at least one of the following" or similar expressions refer to any combination of these items, including any combination of the singular or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or multiple.
It is to be understood that the various numerical references referred to in the embodiments of the present application are merely for convenience of description and distinction and are not intended to limit the scope of the embodiments of the present application. In the embodiment of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiment of the present application.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (20)

1. A training method of an item recommendation model is characterized by comprising the following steps:
acquiring preset prediction tasks in a recommendation model, wherein the number of the prediction tasks is at least two;
obtaining a user vector and an article vector of each prediction task according to the user characteristics and the article characteristics input to the recommendation model, wherein the user characteristics are used for indicating interaction behaviors of the user and the article and user information, and the article characteristics are used for indicating attribute information of the article;
determining a loss function of each prediction task according to the user vector, the article vector and a preset sample label of each prediction task;
obtaining a target loss function of the recommendation model according to the loss function of each prediction task;
and acquiring a trained recommendation model according to the target loss function, wherein the trained recommendation model is used for predicting the preference degree of the user to different articles.
2. The method of claim 1, wherein the deriving the user vector and the item vector for each of the predicted tasks based on the user characteristics and the item characteristics input to the recommendation model comprises:
acquiring a feature domain of the user feature and a feature domain of the article feature;
according to the feature domain of the user feature and the feature domain of the article feature, dividing the user feature to obtain a user symmetric feature and a user asymmetric feature, and dividing the article feature to obtain an article symmetric feature and an article asymmetric feature;
according to the user symmetric features and the user asymmetric features, obtaining a user symmetric vector and a user asymmetric vector of each prediction task as the user vectors;
and obtaining an item symmetric vector and an item asymmetric vector of each prediction task as item vectors according to the item symmetric features and the item asymmetric features.
3. The method of claim 2, wherein the deriving the user vector and the item vector for each of the predicted tasks based on the user characteristics and the item characteristics input to the recommendation model comprises:
according to the number of the predicted tasks, constructing multilayer perceptrons with target number in the recommendation model, wherein the multilayer perceptrons are used for carrying out feature modeling;
determining a multi-layer perceptron corresponding to the user symmetric characteristic, the user asymmetric characteristic, the article symmetric characteristic and the article asymmetric characteristic of each prediction task respectively;
modeling the user symmetric characteristic and the user asymmetric characteristic of each prediction task by using a corresponding multilayer perceptron to obtain a user symmetric vector and a user asymmetric vector corresponding to each prediction task;
and respectively modeling the article symmetric feature and the article asymmetric feature of each prediction task by using a corresponding multilayer perceptron to obtain an article symmetric vector and an article asymmetric vector of each prediction task.
4. The method of claim 3, wherein determining the loss function for each prediction task based on the user vector, the item vector, and the preset sample label for each prediction task comprises:
splicing the user symmetric vector and the user asymmetric vector of each prediction task to obtain a first user splicing vector of each prediction task;
splicing the article symmetric vector and the article asymmetric vector of each prediction task to obtain a first article splicing vector of each prediction task;
and determining a loss function corresponding to each prediction task according to the first user splicing vector and the first article splicing vector of each prediction task.
5. The method of claim 4, wherein the splicing the user symmetric vector and the user asymmetric vector of each prediction task to obtain a first user splicing vector of each prediction task comprises:
splicing the user symmetric vector and the user asymmetric vector of each prediction task to obtain a first spliced vector after splicing;
and adding a first preset dimension into the first splicing vector to obtain a first user splicing vector of each prediction task.
6. The method of claim 4, wherein the splicing the article symmetric vector and the article asymmetric vector of each prediction task to obtain a first article splicing vector of each prediction task comprises:
splicing the article symmetric vector and the article asymmetric vector of each prediction task to obtain a second spliced vector after splicing;
and adding a second preset dimension into the second splicing vector to obtain a first article splicing vector of each prediction task.
7. The method of claim 6, wherein the determining a loss function corresponding to each prediction task according to the first user splicing vector and the first article splicing vector of each prediction task comprises:
performing inner product on the first user splicing vector and the first article splicing vector of each prediction task to obtain a first inner product result;
calculating a predicted value of each predicted task according to a preset activation function and the first inner product result;
and determining a loss function of each prediction task according to a preset cross entropy function, the prediction value and a preset sample label.
8. The method of claim 1, wherein obtaining the target loss function of the recommendation model based on the loss function of each predicted task comprises:
acquiring a weight corresponding to the loss function of each prediction task, wherein the weight is used for representing the importance degree of the loss function of each prediction task to the target loss function;
and multiplying the loss function of each prediction task by the corresponding weight respectively, and adding to obtain the target loss function.
9. The method of claim 1, wherein obtaining the trained recommendation model according to the objective loss function comprises:
and updating the parameters of the recommendation model by utilizing back propagation according to the target loss function to obtain the trained recommendation model.
10. The method of claim 1, wherein before the obtaining a user vector and an article vector of each prediction task according to the user characteristics and the article characteristics input to the recommendation model, the method further comprises:
acquiring the user characteristics in a preset user behavior log;
and acquiring the article characteristics from a preset offline characteristic log.
11. A method of screening an article, comprising:
acquiring user characteristics of a target user and article characteristics of a recalled article, wherein the recalled article is recalled from a preset article pool;
inputting the user characteristics and the article characteristics into a recommendation model to obtain a ranking result of the preference degree of the target user for the recalled articles, wherein the recommendation model is obtained by utilizing the user characteristics and the article characteristics of at least two prediction tasks;
and screening the recalled articles to obtain target articles according to the sorting result.
12. The method of claim 11, wherein the inputting the user characteristic and the item characteristic into a recommendation model to obtain a result of ranking the target user's preference for the recalled item comprises:
obtaining a target user vector and a target article vector of each prediction task in the recommendation model according to the user characteristics and the article characteristics;
splicing the target user vectors of each prediction task to obtain a second user splicing vector;
after splicing the target object vector of each prediction task, multiplying the target object vector by a preset coefficient to obtain a second object splicing vector, wherein the preset coefficient is used for representing the weight among the prediction tasks;
performing inner product on the second user splicing vector and the second article splicing vector to obtain a second inner product result;
and determining a sequencing result of the preference degree of the target user for the recalled item according to the second inner product result.
13. The method of claim 12, wherein obtaining a target user vector and a target item vector for each predicted task in the recommendation model based on the user characteristics and item characteristics comprises:
acquiring a user symmetric characteristic and a user asymmetric characteristic from the user characteristics;
acquiring an article symmetric feature and an article asymmetric feature from the article features;
obtaining a target user vector of each prediction task according to the user symmetric characteristic and the user asymmetric characteristic of each prediction task;
and obtaining a target item vector of each prediction task according to the item symmetric characteristic and the item asymmetric characteristic of each prediction task.
14. The method of claim 11, wherein the obtaining of the user characteristics of the target user and the item characteristics of the recalled item comprises:
acquiring an article image from a preset article image database, and determining the article characteristics of the recalled article according to the article image;
the method comprises the steps of obtaining a user portrait in a preset user portrait database, and determining user characteristics of a target user according to the user portrait.
15. An apparatus for training an item recommendation model, comprising:
the task obtaining module is used for obtaining the prediction tasks in the recommendation model, and the number of the prediction tasks is at least two;
the vector acquisition module is used for acquiring a user vector and an article vector of each prediction task according to the user characteristics and the article characteristics input to the recommendation model, wherein the user characteristics are used for indicating interaction behaviors of a user and an article and user information, and the article characteristics are used for indicating attribute information of the article;
the function determining module is used for determining a loss function of each prediction task according to the user vector, the article vector and the preset sample label of each prediction task;
the function obtaining module is used for obtaining a target loss function of the recommendation model according to the loss function of each prediction task;
and the model training module is used for acquiring a trained recommendation model according to the target loss function, and the trained recommendation model is used for predicting the preference degree of the user on different articles.
16. An article screening apparatus, comprising:
the system comprises a characteristic acquisition module, a characteristic acquisition module and a characteristic analysis module, wherein the characteristic acquisition module is used for acquiring user characteristics of a target user and article characteristics of a recalled article, and the recalled article is recalled from a preset article pool;
the item sorting module is used for inputting the user characteristics and the item characteristics into a recommendation model to obtain a sorting result of the preference degree of the target user for the recalled items, and the recommendation model is obtained by utilizing the user characteristics and the item characteristics of at least two prediction tasks;
and the article screening module is used for screening the recalled articles to obtain target articles according to the sorting result.
17. A computer device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the method of any of claims 1-10.
18. A server, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the method of any of claims 11-14.
19. A readable storage medium having stored therein computer instructions, which when executed by a processor, are adapted to implement the method of any one of claims 1-14.
20. A program product comprising computer instructions, characterized in that the computer instructions, when executed by a processor, implement the method of any of claims 1-14.
CN202111246631.8A 2021-10-26 2021-10-26 Training method of item recommendation model, item screening method, device and equipment Pending CN113850654A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111246631.8A CN113850654A (en) 2021-10-26 2021-10-26 Training method of item recommendation model, item screening method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111246631.8A CN113850654A (en) 2021-10-26 2021-10-26 Training method of item recommendation model, item screening method, device and equipment

Publications (1)

Publication Number Publication Date
CN113850654A true CN113850654A (en) 2021-12-28

Family

ID=78983059

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111246631.8A Pending CN113850654A (en) 2021-10-26 2021-10-26 Training method of item recommendation model, item screening method, device and equipment

Country Status (1)

Country Link
CN (1) CN113850654A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024012360A1 (en) * 2022-07-11 2024-01-18 华为技术有限公司 Data processing method and related apparatus
CN115222486A (en) * 2022-07-29 2022-10-21 平安科技(深圳)有限公司 Article recommendation model training method, article recommendation device and storage medium
CN115222486B (en) * 2022-07-29 2024-02-02 平安科技(深圳)有限公司 Article recommendation model training method, article recommendation method, device and storage medium

Similar Documents

Publication Publication Date Title
CN108648049B (en) Sequence recommendation method based on user behavior difference modeling
CN110717098B (en) Meta-path-based context-aware user modeling method and sequence recommendation method
CN111815415B (en) Commodity recommendation method, system and equipment
CN107944913A (en) High potential user's purchase intention Forecasting Methodology based on big data user behavior analysis
CN111626832B (en) Product recommendation method and device and computer equipment
CN113256367B (en) Commodity recommendation method, system, equipment and medium for user behavior history data
CN111242729A (en) Serialization recommendation method based on long-term and short-term interests
Chou et al. Predictive analytics for customer repurchase: Interdisciplinary integration of buy till you die modeling and machine learning
CN112508256B (en) User demand active prediction method and system based on crowdsourcing
CN113850654A (en) Training method of item recommendation model, item screening method, device and equipment
CN111582538A (en) Community value prediction method and system based on graph neural network
CN111695024A (en) Object evaluation value prediction method and system, and recommendation method and system
Choudhary et al. SARWAS: Deep ensemble learning techniques for sentiment based recommendation system
Wu et al. Discovery of associated consumer demands: Construction of a co-demanded product network with community detection
KR20210144330A (en) Method and apparatus for recommending item based on collaborative filtering neural network
CN115456707A (en) Method and device for providing commodity recommendation information and electronic equipment
CN113763095B (en) Information recommendation method and device and model training method and device
Bhattacharjee et al. An integrated machine learning and DEMATEL approach for feature preference and purchase intention modelling
Senvar et al. Customer oriented intelligent DSS based on two-phased clustering and integrated interval type-2 fuzzy AHP and hesitant fuzzy TOPSIS
CN116228280A (en) User demand prediction method based on big data
CN114429384B (en) Intelligent product recommendation method and system based on e-commerce platform
Wang et al. Jointly modeling intra-and inter-transaction dependencies with hierarchical attentive transaction embeddings for next-item recommendation
Goldstein et al. Are we there yet? Analyzing progress in the conversion funnel using the diversity of searched products
Agarwal et al. Binarized spiking neural networks optimized with Nomadic People Optimization-based sentiment analysis for social product recommendation
CN113837847A (en) Knowledge-intensive service recommendation method based on heterogeneous multivariate relation fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination