Disclosure of Invention
In view of this, embodiments of the present disclosure provide a recommendation model training method and apparatus, a recommendation method and apparatus, a computing device, and a computer-readable storage medium, so as to solve technical defects in the prior art.
In a first aspect, an embodiment of the present specification discloses a training method for a recommendation model, including:
acquiring user characteristics of at least two sample users and attribute characteristics of at least two sample application programs;
generating, based on the user characteristics and the attribute characteristics, a positive sample in which a sample user clicked and purchased the exposed sample application program, and a negative sample in which the sample user either clicked but did not purchase, or did not click, the exposed sample application program;
training a recommendation model based on a sample set comprising at least one positive sample and one negative sample to obtain the recommendation model, wherein the recommendation model outputs, for each sample user and each exposed sample application program, an exposure conversion rate obtained based on the click rate and the purchase rate.
Optionally, before training the recommendation model based on a sample set including at least one positive sample and one negative sample, the method further includes:
screening the sample set, based on a preset screening rule, into a training sample set comprising at least one positive sample and at least one negative sample and a test sample set comprising at least one positive sample and at least one negative sample.
Optionally, training the recommendation model based on a set of samples comprising at least one positive sample and one negative sample comprises:
training a recommendation model based on the set of training samples comprising at least one positive sample and one negative sample.
Optionally, after training the recommendation model based on a sample set including at least one positive sample and one negative sample, the method further includes:
testing the recommendation model based on a test sample set including at least one positive sample and one negative sample.
Optionally, the recommendation model includes a DeepFM multi-task learning model, and the DeepFM multi-task learning model includes a click rate estimation part and a purchase rate estimation part.
Optionally, the recommendation model outputting, for each sample user and each exposed sample application program, the exposure conversion rate obtained based on the click rate and the purchase rate comprises:
the click rate estimation part of the recommendation model outputs the click rate of each sample user to each exposed sample application program;
a purchase rate estimation part of the recommendation model outputs the purchase rate of each sample user to each exposed sample application program;
determining an exposure conversion rate for each sample user for each exposed sample application based on the click through rate and the purchase rate.
Optionally, the user features and the attribute features each include offline features and real-time features, wherein the offline features include historical features of the collected sample users and the sample applications, and the real-time features include features of the collected sample users and the sample applications at the time of events.
In a second aspect, an embodiment of the present specification further provides a recommendation method, including:
receiving a recommendation request of a user to be recommended for an exposed application program, wherein the user to be recommended carries a user identifier;
determining at least two applications to be recommended which are matched with the user to be recommended based on the user identification;
extracting the user characteristics of the user to be recommended and the attribute characteristics of the at least two application programs to be recommended;
inputting the user characteristics and the attribute characteristics into a pre-trained recommendation model to obtain the exposure conversion rate of each matched application program to be recommended by the user to be recommended;
recommending at least one application program to be recommended in the at least two application programs to be recommended to a user to be recommended as an exposed application program based on the exposure conversion rate.
Optionally, before receiving the recommendation request of the user to be recommended for an exposed application program, the method further includes:
acquiring a plurality of application programs carrying identifiers;
and screening the plurality of application programs based on a first preset condition, and determining at least two application programs to be recommended.
Optionally, after determining at least two applications to be recommended, the method further includes:
and matching the user to be recommended with the at least two application programs to be recommended based on a preset matching rule, wherein the user to be recommended carries a user identifier.
Optionally, the recommendation model includes a DeepFM multi-task learning model, and the DeepFM multi-task learning model includes a click rate estimation part and a purchase rate estimation part.
Optionally, recommending at least one application to be recommended of the at least two applications to be recommended as an exposed application to a user to be recommended based on the exposure conversion rate includes:
sorting the at least two applications to be recommended based on the exposure conversion rate;
and selecting at least one to-be-recommended application program in the at least two to-be-recommended application programs based on a preset recommendation condition to be recommended to the to-be-recommended user as an exposed application program.
Optionally, after the sorting the at least two applications to be recommended based on the exposure conversion rate, the method further includes:
screening the at least two applications to be recommended based on the second preset condition;
and the selecting, based on a preset recommendation condition, of at least one application program to be recommended from the at least two application programs to be recommended as an exposed application program to be recommended to the user to be recommended comprises:
and selecting at least one application program to be recommended from the at least two application programs to be recommended after screening as an exposed application program to be recommended to the user to be recommended based on a preset recommendation condition.
Optionally, the user features and the attribute features both include offline features and real-time features, where the offline features include collected historical features of the user to be recommended and the application to be recommended, and the real-time features include collected features of the user to be recommended and the application to be recommended at the current time.
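The sorting, screening, and selection steps of the recommendation method above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function name `recommend`, the `keep` predicate (standing in for the second preset condition), and the `top_k` cutoff (standing in for the preset recommendation condition) are all hypothetical.

```python
def recommend(scored_apps, top_k, keep=lambda app_id: True):
    """Rank candidate applications by exposure conversion rate and pick the top ones.

    scored_apps: list of (app_id, exposure_conversion_rate) pairs.
    keep:        hypothetical predicate standing in for the second preset condition.
    top_k:       hypothetical stand-in for the preset recommendation condition.
    """
    # Sort candidates from highest to lowest exposure conversion rate.
    ranked = sorted(scored_apps, key=lambda pair: pair[1], reverse=True)
    # Screen the ranked candidates.
    screened = [app_id for app_id, _ in ranked if keep(app_id)]
    # Select the applications to expose to the user.
    return screened[:top_k]


scores = [("a", 0.02), ("b", 0.10), ("c", 0.05), ("d", 0.07)]
print(recommend(scores, top_k=2))
```

Screening after sorting, as in the optional step above, leaves the relative order of the surviving candidates unchanged.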
In a third aspect, an embodiment of the present specification further provides a training apparatus for a recommendation model, including:
a first obtaining module configured to obtain user characteristics of at least two sample users and attribute characteristics of at least two sample applications;
a generating module configured to generate, based on the user characteristics and the attribute characteristics, a positive sample in which a sample user clicked and purchased the exposed sample application program, and a negative sample in which the sample user either clicked but did not purchase, or did not click, the exposed sample application program;
a training module configured to train a recommendation model based on a sample set comprising at least one positive sample and one negative sample to obtain the recommendation model, wherein the recommendation model outputs, for each sample user and each exposed sample application program, an exposure conversion rate obtained based on the click rate and the purchase rate.
Optionally, the apparatus further comprises:
a first screening module configured to screen the sample set into a training sample set including at least one positive sample and one negative sample and a test sample set including at least one positive sample and one negative sample based on a preset screening rule.
Optionally, the training module is further configured to:
training a recommendation model based on the set of training samples comprising at least one positive sample and one negative sample.
Optionally, the apparatus further comprises:
a testing module configured to test the recommendation model based on a set of test samples comprising at least one positive sample and one negative sample.
Optionally, the recommendation model includes a DeepFM multi-task learning model, and the DeepFM multi-task learning model includes a click rate estimation part and a purchase rate estimation part.
Optionally, the training module comprises:
the first output sub-module is configured to output the click rate of each sample user to each exposed sample application program by the click rate estimation part of the recommendation model;
a second output sub-module configured to output a purchase rate of each sample user for each exposed sample application program by a purchase rate estimation part of the recommendation model;
a determination submodule configured to determine an exposure conversion rate for each sample user for each exposed sample application based on the click through rate and the purchase rate.
Optionally, the user features and the attribute features each include offline features and real-time features, wherein the offline features include historical features of the collected sample users and the sample applications, and the real-time features include features of the collected sample users and the sample applications at the time of events.
In a fourth aspect, an embodiment of the present specification further provides a recommendation device, including:
a receiving module configured to receive a recommendation request of a user to be recommended for an exposed application program, wherein the user to be recommended carries a user identifier;
the determining module is configured to determine at least two applications to be recommended which are matched with the user to be recommended based on the user identification;
the extraction module is configured to extract the user characteristics of the user to be recommended and the attribute characteristics of the at least two application programs to be recommended;
the obtaining module is configured to input the user characteristics and the attribute characteristics into a pre-trained recommendation model to obtain exposure conversion rate of each matched application program to be recommended by the user to be recommended;
the first recommending module is configured to recommend at least one application program to be recommended in the at least two application programs to be recommended to a user to be recommended as an exposed application program based on the exposure conversion rate.
Optionally, the apparatus further comprises:
the second acquisition module is configured to acquire a plurality of application programs carrying the identifiers;
the second screening module is configured to screen the plurality of application programs based on a first preset condition, and determine at least two application programs to be recommended.
Optionally, the apparatus further comprises:
the matching module is configured to match a user to be recommended with the at least two application programs to be recommended based on a preset matching rule, wherein the user to be recommended carries a user identifier.
Optionally, the recommendation model includes a DeepFM multi-task learning model, and the DeepFM multi-task learning model includes a click rate estimation part and a purchase rate estimation part.
Optionally, the first recommending module includes:
a ranking submodule configured to rank the at least two applications to be recommended based on the exposure conversion rate;
and the second recommending submodule is configured to select at least one to-be-recommended application program in the at least two to-be-recommended application programs to be recommended to the to-be-recommended user as an exposed application program based on a preset recommending condition.
Optionally, the apparatus further comprises:
the third screening module is configured to screen the at least two applications to be recommended based on the second preset condition;
a second recommendation sub-module further configured to:
and selecting at least one application program to be recommended from the at least two application programs to be recommended after screening as an exposed application program to be recommended to the user to be recommended based on a preset recommendation condition.
Optionally, the user characteristic and the attribute characteristic both include an offline characteristic and a real-time characteristic, where the offline characteristic includes collected historical characteristics of the user to be recommended and the application to be recommended, and the real-time characteristic includes collected characteristics of the user to be recommended and the application to be recommended at a current time.
In a fifth aspect, embodiments of the present specification disclose a computing device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, the processor implementing a training method for a recommendation model or steps of the recommendation method as described above when executing the instructions.
In a sixth aspect, the present specification discloses a computer readable storage medium storing computer instructions, which when executed by a processor implement the training method of the recommendation model or the steps of the recommendation method as described above.
The present application provides a training method and apparatus for a recommendation model, a recommendation method and apparatus, a computing device, and a computer-readable storage medium. The recommendation method includes: receiving a recommendation request of a user to be recommended for an exposed application program, wherein the user to be recommended carries a user identifier; determining, based on the user identifier, at least two application programs to be recommended that match the user to be recommended; extracting the user characteristics of the user to be recommended and the attribute characteristics of the at least two application programs to be recommended; inputting the user characteristics and the attribute characteristics into a pre-trained recommendation model to obtain the exposure conversion rate of the user to be recommended for each matched application program to be recommended; and recommending, based on the exposure conversion rate, at least one of the at least two application programs to be recommended to the user to be recommended as an exposed application program. A deep-learning-based recommendation model is adopted, namely a model with a DeepFM structure that uses multi-task learning, and exposed application programs are recommended online in real time for the user to be recommended. This makes effective use of the model's ability to cross and combine features automatically and of the conversion features observed after an exposed application program is clicked, addresses the feature sparsity problem, and improves the recommendation effect for exposed application programs.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the present application can be implemented in many ways other than those described herein, and those skilled in the art can make similar extensions without departing from the spirit of the present application; the present application is therefore not limited to the specific implementations disclosed below.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, etc. may be used herein in one or more embodiments to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first can also be referred to as a second and, similarly, a second can also be referred to as a first without departing from the scope of one or more embodiments of the present description. The word "if," as used herein, may be interpreted as "at the time of," "when," or "in response to a determination," depending on the context.
First, the terms involved in one or more embodiments of the present invention are explained.
FM: Factorization Machines. A factorization machine is a machine learning algorithm based on matrix factorization, proposed by Steffen Rendle, that can make predictions for any real-valued feature vector. Its main advantages are that 1) it can be used in highly sparse data scenarios; and 2) it has linear computational complexity. Its main purpose in this application is to solve the problem of how to combine features when the data is sparse.
DNN: Deep Neural Network. Divided by the position of the layers, the neural network layers inside a DNN fall into three types: the input layer, the hidden layers, and the output layer.
DeepFM model: a deep learning model and neural network framework that integrates FM and DNN. It combines the advantages of DNN and FM and can extract low-order and high-order combined features simultaneously.
CTR: Click-Through Rate, i.e., the click rate.
CVR: Conversion Rate, referred to in this application as the purchase rate.
MTL: Multi-Task Learning, an inductive transfer mechanism whose main goal is to improve generalization by using the domain-specific information implicit in the training signals of multiple related tasks. It does so by training multiple tasks in parallel using a shared representation.
TL: Transfer Learning, which transfers knowledge from one domain (the source domain) to another domain (the target domain) so that the target domain can achieve a better learning effect.
LR: Logistic Regression.
GBDT: Gradient Boosting Decision Tree.
In the present application, a training method and apparatus for a recommendation model, a recommendation method and apparatus, a computing device and a computer readable storage medium are provided, which are described in detail in the following embodiments one by one.
Fig. 1 is a block diagram illustrating a configuration of a computing device 100 according to an embodiment of the present specification. The components of the computing device 100 include, but are not limited to, memory 110 and processor 120. The processor 120 is coupled to the memory 110 via a bus 130 and a database 150 is used to store data.
Computing device 100 also includes access device 140, access device 140 enabling computing device 100 to communicate via one or more networks 160. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. Access device 140 may include one or more of any type of network interface (e.g., a Network Interface Card (NIC)) whether wired or wireless, such as an IEEE802.11 Wireless Local Area Network (WLAN) wireless interface, a worldwide interoperability for microwave access (Wi-MAX) interface, an ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above components of the computing device 100 and other components not shown in fig. 1 may also be connected to each other, for example, through a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 1 is for purposes of example only and is not limiting as to the scope of the description. Those skilled in the art may add or replace other components as desired.
Computing device 100 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smartphone), a wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 100 may also be a mobile or stationary server.
The processor 120 may perform the steps of the method shown in Fig. 2. Fig. 2 shows a schematic flow chart of a training method of a recommendation model according to an embodiment of the present specification, including step 202 to step 206.
Step 202: user characteristics of at least two sample users and attribute characteristics of at least two sample applications are obtained.
For example, the user characteristics of at least two sample users and the attribute characteristics of at least two sample applications within a preset time period may be obtained, where the preset time period may include 60 days, 120 days, and the like, and the preset time period is set according to actual requirements, which is not limited in this application.
In practical applications, the sample user and the sample application may carry a user identifier and an application identifier, where the identifier is unique identification information for distinguishing each sample user from each sample application, for example, a unique special character string or a special mark set for each sample user or each sample application and playing a role in identification.
A sample user is a user to whom sample applications are recommended, and the sample applications include, but are not limited to, office, game, and entertainment applications.
The user features and the attribute features each include offline features and real-time features, wherein the offline features include historical features of the collected sample users and the sample applications, and the real-time features include features of the collected sample users and the sample applications at the time of events.
The offline features of the user features include, but are not limited to: basic profile features of the user, such as age, gender, zodiac sign, occupation, education level, and life stage; wealth features of the user, such as income, purchasing power, probability of owning a home, and probability of owning a car; location features of the user, such as place of birth, place of employment, place of home, and premise; user behavior features, such as the number of exposures, the number of clicks, and the click rate of the user with respect to applications; and other features, such as interest preference features, search query features, activity-level features, historical transaction features, and real-time red envelope features. The real-time features of the user features include, but are not limited to, scene features of the user, such as the user's channel source; that is, in practical applications, the sample applications that a sample user prefers can be determined from the channel from which the sample user was redirected.
The offline features of the attribute features include, but are not limited to: basic attribute features of the application, such as category, price, rating, number of reviews, ranking, and language; and statistical features of the application over the last 1/3/7/15/30/90 days, such as exposure PV (page views, i.e., the number of exposures), exposure UV (the number of unique visitors exposed), click PV (the number of clicks), click UV (the number of unique visitors who clicked), PV click rate, and UV click rate (click count / exposure count). The real-time features of the attribute features include, but are not limited to, scene features such as the current hour and the day of the week; that is, in practical applications, the sample applications preferred by a sample user can be determined from the current time and the day of the week.
In practical applications, the sample application may be obtained from an ODPS (Open Data Processing Service), in which the original full amount of applications is stored.
Step 204: based on the user features and the attribute features, generate positive samples in which a sample user clicked and purchased the exposed sample application, and negative samples in which the sample user either clicked but did not purchase, or did not click, the exposed sample application.
Wherein the exposed sample application is a sample application that is exposed to a sample user.
To estimate the exposure conversion rate of each sample user for each exposed sample application, the sample users over a 60-day period and the sample applications that those users clicked or did not click after exposure can be obtained. The user features of the sample users and the attribute features of the exposed sample applications are then analyzed to form post-exposure click positive and negative samples (labels) and post-exposure conversion positive and negative samples (labels). After the user features and the attribute features are spliced together, a sample set including at least one positive sample and one negative sample is generated, and each piece of positive or negative sample data in the sample set can be expressed as (features, label).
In practical applications, the post-exposure click labels and the post-exposure conversion labels may be generated for each combination of the user identifier (user_id) of a sample user and the application identifier (item_id) of a sample application. If the sample application is exposed and the sample user clicks it, a post-exposure click positive sample (y1 = 1) is generated; if the sample application is exposed but the sample user does not click it, a post-exposure click negative sample (y1 = 0) is generated. If the sample application is exposed and the sample user clicks and purchases it, a post-exposure conversion positive sample (y2 = 1) is generated; if, after exposure, the sample user does not click, or clicks but does not purchase, a post-exposure conversion negative sample (y2 = 0) is generated. The result can be expressed as (user_id, item_id, y1, y2).
Then, after the user features and the attribute features are spliced according to user_id and item_id, a sample set including at least one positive sample and one negative sample is generated, and each piece of positive or negative sample data in the sample set can be represented as (user_id, item_id, features, y1, y2).
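The labeling and splicing rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Exposure` record, the function names, and the use of plain lists as feature vectors are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exposure:
    """Hypothetical log record: one sample application exposed to one sample user."""
    user_id: str
    item_id: str
    clicked: bool
    purchased: bool


def make_labels(e):
    # y1: post-exposure click label (1 if the user clicked, else 0).
    y1 = 1 if e.clicked else 0
    # y2: post-exposure conversion label (1 only if the user clicked AND purchased).
    y2 = 1 if (e.clicked and e.purchased) else 0
    return (e.user_id, e.item_id, y1, y2)


def build_sample_set(exposures, user_features, item_features):
    """Splice user and attribute features by user_id / item_id and attach labels."""
    samples = []
    for e in exposures:
        user_id, item_id, y1, y2 = make_labels(e)
        features = user_features[user_id] + item_features[item_id]
        samples.append((user_id, item_id, features, y1, y2))
    return samples


exposures = [
    Exposure("u1", "a1", clicked=True, purchased=True),    # y1 = 1, y2 = 1
    Exposure("u1", "a2", clicked=True, purchased=False),   # y1 = 1, y2 = 0
    Exposure("u2", "a1", clicked=False, purchased=False),  # y1 = 0, y2 = 0
]
user_features = {"u1": [0.3], "u2": [0.7]}
item_features = {"a1": [1.0, 2.0], "a2": [3.0, 4.0]}
sample_set = build_sample_set(exposures, user_features, item_features)
```

Each element of `sample_set` has the (user_id, item_id, features, y1, y2) layout described above.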
Step 206: training a recommendation model based on a sample set comprising at least one positive sample and one negative sample to obtain the recommendation model, wherein the recommendation model outputs exposure conversion rate obtained by each sample user based on click rate and purchase rate of each exposed sample application program.
In one or more embodiments of the present specification, before training the recommendation model based on a sample set including at least one positive sample and one negative sample, the method further includes:
and screening the sample set into a training sample set comprising at least one positive sample and one negative sample and a test sample set comprising at least one positive sample and one negative sample based on a preset screening rule.
The preset screening rule includes, but is not limited to, selecting a preset number of positive and negative sample data within a preset time length from the generated sample set as the test sample set, and taking the remaining positive and negative sample data as the training sample set. For example, the preset screening rule may include extracting 200,000 pieces of positive and negative sample data from the last day's data in the generated sample set of 140 million samples as the test sample set, and using the remaining positive and negative sample data as the training sample set. In practical applications, because only a small fraction of application exposures lead to conversion, a test sample set with too few positive and negative samples will contain few exposure-conversion positive samples, and the evaluation may be inaccurate; selecting 1.5 million positive and negative samples as the test sample set is therefore more appropriate.
Training the recommendation model based on a sample set including at least one positive sample and one negative sample thus means training the recommendation model based on the training sample set including at least one positive sample and one negative sample, to obtain the recommendation model, which outputs, for each sample user and each exposed sample application program, the exposure conversion rate obtained based on the click rate and the purchase rate.
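One possible form of such a screening rule is sketched below: a fixed number of samples is drawn from the most recent day as the test sample set, and everything else becomes the training sample set. The function name, the `(day, sample)` pair layout, and the random draw are assumptions made for illustration; the sketch also does not check that each resulting set contains at least one positive and one negative sample.

```python
import random


def split_by_last_day(dated_samples, test_size, seed=0):
    """Split (day, sample) pairs into training and test sample sets.

    The test set is drawn at random from the samples of the most recent day;
    all remaining samples form the training set.
    """
    last_day = max(day for day, _ in dated_samples)
    last_day_idx = [i for i, (day, _) in enumerate(dated_samples) if day == last_day]
    rng = random.Random(seed)
    # Never request more test samples than the last day actually contains.
    chosen = set(rng.sample(last_day_idx, min(test_size, len(last_day_idx))))
    train = [s for i, (_, s) in enumerate(dated_samples) if i not in chosen]
    test = [s for i, (_, s) in enumerate(dated_samples) if i in chosen]
    return train, test


# Three days of toy data with five samples per day.
data = [(day, f"s{day}_{k}") for day in range(1, 4) for k in range(5)]
train_set, test_set = split_by_last_day(data, test_size=3)
```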
In practical applications, after training the recommendation model based on the sample set including at least one positive sample and one negative sample, the method further includes:
the recommendation model is tested based on a set of test samples including at least one positive sample and one negative sample.
Testing the recommendation model based on a test sample set including at least one positive sample and one negative sample means inputting the test sample set into the recommendation model, so that the recommendation model outputs, for each sample user and each exposed sample application program, the exposure conversion rate obtained based on the click rate and the purchase rate. The recommendation model outputting this exposure conversion rate comprises:
the click rate estimation part of the recommendation model outputs the click rate of each sample user to each exposed sample application program;
a purchase rate estimation part of the recommendation model outputs the purchase rate of each sample user to each exposed sample application program;
determining an exposure conversion rate for each sample user for each exposed sample application based on the click through rate and the purchase rate.
The recommendation model includes, but is not limited to, a DeepFM multi-task learning model. In one or more embodiments of the present specification, the recommendation model is described by taking the DeepFM multi-task learning model as an example.
The DeepFM multi-task learning model applies multi-task learning (MTL) to the DeepFM model. The DeepFM model combines the advantages of DNN and FM and can extract low-order and high-order combined features simultaneously. The FM part extracts low-order combined features, including a linear combination of first-order features (the dot product of weights and features) and second-order cross features (the inner product of latent vectors). The Deep part extracts high-order combined features. FM and Deep share the same input and embedding vectors. Specifically, the prediction result of the DeepFM model is expressed by formula (1):
wherein, the output formula of FM is expressed as formula (2):
the output formula of DNN is expressed as formula (3):
y_{DNN} = \sigma\left(W^{|H|+1} \cdot a^{H} + b^{|H|+1}\right) \quad (3)
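Formulas (1) and (2) are not reproduced in the text above. Assuming they follow the standard DeepFM formulation (an assumption, since the original formula images are unavailable here), they may be written as below, where x is the input feature vector of dimension d, w holds the first-order weights, and V_i is the latent (embedding) vector of feature i. This matches the description of the FM part as a first-order dot product plus second-order inner products, and the symbols y_{FM} and y_{DNN} agree with formula (3).

```latex
\hat{y} = \mathrm{sigmoid}\left(y_{FM} + y_{DNN}\right) \quad (1)

y_{FM} = \langle w, x \rangle + \sum_{i=1}^{d} \sum_{j=i+1}^{d} \langle V_i, V_j \rangle \, x_i x_j \quad (2)
```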
As shown in fig. 3, a network structure of the DeepFM model is provided. The DeepFM model is divided into a deep neural network part and an FM (factorization machine) part, and the deep neural network part may adopt a fully connected feedforward neural network (DNN). The DNN and the FM divide the input user features and attribute features into a plurality of feature groups, and each feature group corresponds to one embedding vector. A feature concatenation layer (concat) of the deep neural network part concatenates all the embedding vectors, and two fully connected layers (Fc (relu)) are then added to realize the combination of high-order features. The FM part performs a weighted summation (addition) over the original input features, such as the user features and attribute features, and extracts feature combinations through inner products of the embedding vectors to realize the combination of low-order features. Finally, the outputs of the deep neural network and the FM part are combined to obtain a prediction result (sigmoid), namely a click rate prediction result.
Multi-task learning (MTL) enables the DeepFM model to generalize better on the original task by sharing representations between related tasks. It is an inductive transfer mechanism whose main goal is to improve generalization by using the domain-specific information implicit in the training signals of multiple related tasks; multi-task learning accomplishes this by training multiple tasks in parallel using a shared representation.
In practical applications, an application program recommended to a user may go through a plurality of steps, such as exposure, click, and final purchase. The goal of using the DeepFM multi-task learning model is to increase the final UV exposure conversion rate (CTCVR = number of purchases / number of exposures). The calculation formula of the exposure conversion rate (CTCVR) is shown in formula (4); it is obtained by multiplying the CTR (click rate estimate) by the CVR (purchase rate estimate), i.e., CTCVR = CTR × CVR.
If a click rate (CTR) recommendation model is trained only on positive and negative samples of clicks after exposure (that is, sorting is performed according to pCTR during recommendation), the post-click conversion data is not fully utilized: some application programs have a high click rate but a very low purchase rate after clicking, so the overall CTCVR is not high. Alternatively, a separate CVR model may be trained to predict the purchase rate of the user after clicking, and pCTCVR is then obtained as pCTR × pCVR for sorting. However, directly training the CVR model uses only the sample set clicked by users, whose positive and negative sample volume is usually one to two orders of magnitude smaller than that of the exposed sample set, so it faces the problem of sample data sparsity.
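The structure described for fig. 3 can be illustrated with a small forward pass in plain Python. This is a sketch under stated assumptions, not the embodiment's implementation: each feature group is assumed to contribute one active feature with value 1, and all names (`fm_output`, `dnn_output`, `deepfm_predict`) and weight values are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def fm_output(first_order_w, embeddings):
    """FM part: sum of first-order weights plus pairwise inner products
    of the embedding vectors (one active feature per feature group)."""
    first = sum(first_order_w)
    second = sum(dot(embeddings[i], embeddings[j])
                 for i in range(len(embeddings))
                 for j in range(i + 1, len(embeddings)))
    return first + second

def dnn_output(embeddings, layers, out_w, out_b):
    """Deep part: concatenate embeddings, apply ReLU fully connected
    layers, then a linear output unit."""
    a = [v for e in embeddings for v in e]  # concat layer
    for weights, biases in layers:          # Fc (relu) layers
        a = [max(0.0, dot(row, a) + b) for row, b in zip(weights, biases)]
    return dot(out_w, a) + out_b

def deepfm_predict(first_order_w, embeddings, layers, out_w, out_b):
    """Combine FM and Deep outputs and squash with a sigmoid."""
    return sigmoid(fm_output(first_order_w, embeddings)
                   + dnn_output(embeddings, layers, out_w, out_b))

# Two feature groups with 2-dimensional embeddings shared by FM and Deep.
embeddings = [[0.1, 0.2], [0.3, 0.4]]
first_order_w = [0.5, -0.2]
layers = [
    ([[0.1, 0.0, -0.1, 0.2], [0.0, 0.3, 0.1, -0.2]], [0.0, 0.1]),
    ([[0.5, -0.5], [0.2, 0.2]], [0.0, 0.0]),
]
p_click = deepfm_predict(first_order_w, embeddings, layers,
                         out_w=[0.3, -0.1], out_b=0.0)
```

Note that the same `embeddings` list is passed to both parts, which mirrors the statement that FM and Deep share the input and embedding vectors.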
Referring to fig. 4, in an embodiment of the present description, a DeepFM multi-task learning model is used. The DeepFM multi-task learning model is mainly divided into a CTR part and a CVR part. It divides the input bottom-layer features into a plurality of feature groups, each feature group corresponds to an embedding vector, and the CTR part and the CVR part share the bottom-layer features and the embedding vectors. As can be seen from fig. 4, the CTR part and the CVR part are each a DeepFM model: all the embedding vectors are concatenated by a feature concatenation layer (concat), two fully connected layers (Fc (relu)) are then added to realize the combination of high-order features, and feature combinations are extracted through inner products of the embedding vectors to realize the combination of low-order features, so as to obtain a click rate prediction result and a purchase rate prediction result, wherein the purchase rate is a purchase rate after exposure. Finally, the click rate prediction result and the purchase rate prediction result are multiplied to obtain an exposure conversion rate prediction result (i.e., pCTCVR). The loss function of the DeepFM multi-task learning model is shown in formula (5), where yi denotes whether a click occurred and zi denotes whether a purchase occurred; it optimizes both the exposure click rate and the exposure purchase rate, and pCVR is only an intermediate node of the DeepFM multi-task learning model. Finally, when recommendation is performed, the application programs are sorted in descending order of pCTCVR, and the top N exposed application programs are recommended to the user.
In one or more embodiments of the present specification, a method for training a recommendation model is provided, in which user characteristics of at least two sample users and attribute characteristics of at least two sample applications are obtained; positive samples, in which a sample user clicked and purchased an exposed sample application program, and negative samples, in which the sample user clicked but did not purchase, or did not click, the exposed sample application program, are generated based on the user characteristics and the attribute characteristics; and a recommendation model is trained based on a sample set comprising at least one positive sample and one negative sample to obtain the recommendation model, wherein the recommendation model outputs the exposure conversion rate obtained by each sample user based on the click rate and purchase rate of each exposed sample application program. When the deep-learning-based DeepFM multi-task learning model is trained, data such as whether sample users purchased or did not purchase exposed application programs after clicking, the automatic feature crossing capability of the DeepFM multi-task learning model, and the post-click conversion features of exposed application programs are fully utilized, so that the problem of feature sparsity is solved and the recommendation effect of exposed application programs is greatly improved.
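Formula (5) is not reproduced in the text. One form consistent with the description (a click loss plus an exposure-purchase loss, with pCVR appearing only as an intermediate factor) is the sum of two cross-entropy terms sketched below. This form is an assumption for illustration, and the function names are hypothetical.

```python
import math

def bce(label, p, eps=1e-12):
    """Binary cross-entropy for one example, with clipping for stability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def multitask_loss(batch):
    """Average loss over (y, z, p_ctr, p_cvr) tuples.

    y: 1 if the exposure was clicked, z: 1 if it was purchased.
    The click task is supervised by y against pCTR; the exposure-purchase
    task is supervised by y*z against pCTCVR = pCTR * pCVR. pCVR itself
    has no direct label and is only an intermediate node.
    """
    total, count = 0.0, 0
    for y, z, p_ctr, p_cvr in batch:
        p_ctcvr = p_ctr * p_cvr
        total += bce(y, p_ctr) + bce(y * z, p_ctcvr)
        count += 1
    return total / count

batch = [
    (1, 1, 0.5, 0.5),  # clicked and purchased
    (1, 0, 0.4, 0.1),  # clicked, not purchased
    (0, 0, 0.1, 0.2),  # not clicked
]
loss = multitask_loss(batch)
```

Because both terms are computed over every exposure, the purchase branch is trained on the full exposed sample set rather than only on clicked samples, which is how this kind of design sidesteps the sparsity issue raised for a standalone CVR model.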
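As a minimal sketch of the sample generation summarized above, each exposure record can be labeled with a click flag and a purchase flag. The tuple layout of the exposure log and the name `label_exposure` are illustrative assumptions.

```python
def label_exposure(clicked, purchased):
    """Label one exposure: positive only if clicked and purchased.

    Negative samples cover both "clicked but not purchased" and
    "not clicked".
    """
    y = 1 if clicked else 0
    z = 1 if clicked and purchased else 0
    return {"y": y, "z": z, "positive": z == 1}

# Hypothetical exposure log: (user_id, app_id, clicked, purchased).
exposure_log = [
    ("u1", "a1", True, True),
    ("u1", "a2", True, False),
    ("u2", "a1", False, False),
]
labeled = [dict(user=u, app=a, **label_exposure(c, p))
           for u, a, c, p in exposure_log]
```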
The processor 120 may also perform the steps of the method shown in fig. 5. Fig. 5 shows a schematic flow chart of a recommendation method according to an embodiment of the present specification, including steps 502 to 510.
Step 502: receiving a recommendation request of a user to be recommended for an exposed application program, wherein the user to be recommended carries a user identifier.
The user to be recommended is the user to whom exposed application programs are to be recommended; the exposed application is an application that is exposed and recommended to the user to be recommended.
The user identifier is unique identification information for distinguishing each user to be recommended, for example, a unique special character string or a special mark which is set for each user to be recommended and has an identification function.
In one or more embodiments of the present specification, before receiving the recommendation request of the user to be recommended for an exposed application program, the method further includes:
acquiring a plurality of application programs carrying identifications;
and screening the plurality of application programs based on a first preset condition, and determining at least two application programs to be recommended.
The application program to be recommended is an application program waiting to be recommended to the user. The first preset condition includes, but is not limited to, selecting application programs within a preset interval; for example, if the preset interval is 1 to 200, the first preset condition may be selecting application programs priced within the interval of 1 to 200. The first preset condition may further include rejecting pornography-, gambling-, or drug-related applications, poor-quality applications, and the like. Poor-quality applications may be determined by the scores and the numbers of comments of the applications; for example, lower limit values of the score and the number of comments are set, and applications with a score less than 3 and fewer than 200 comments are identified as poor-quality applications.
The first preset condition may be set according to actual requirements, which is not limited in this application. Screening the application programs by the first preset condition reduces the number of application programs, which reduces the subsequent workload while guaranteeing the recommendation quality.
Taking as an example the first preset condition of selecting application programs priced in the range of 1 to 200 yuan, screening the plurality of application programs based on the first preset condition and determining at least two application programs to be recommended means selecting the application programs priced in the range of 1 to 200 yuan as the application programs to be recommended.
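The first preset condition can be sketched as a simple predicate over application records. The thresholds follow the examples given in the text; the field names, the category labels, and the function `is_candidate` are assumptions for illustration.

```python
# Hypothetical category labels for applications that are always rejected.
BANNED_CATEGORIES = frozenset({"pornography", "gambling", "drugs"})

def is_candidate(app, min_price=1, max_price=200, min_score=3, min_comments=200):
    """Return True if the app passes the first preset condition."""
    if not (min_price <= app["price"] <= max_price):
        return False
    if app["category"] in BANNED_CATEGORIES:
        return False
    # Poor quality: score below the lower limit AND too few comments.
    poor_quality = app["score"] < min_score and app["comments"] < min_comments
    return not poor_quality

apps = [
    {"id": "a1", "price": 30, "category": "game", "score": 4.5, "comments": 1200},
    {"id": "a2", "price": 500, "category": "tool", "score": 4.8, "comments": 900},
    {"id": "a3", "price": 10, "category": "gambling", "score": 4.0, "comments": 300},
    {"id": "a4", "price": 5, "category": "tool", "score": 2.1, "comments": 50},
]
candidates = [app["id"] for app in apps if is_candidate(app)]
```

Here only `a1` survives: `a2` is out of the price interval, `a3` is in a rejected category, and `a4` is poor quality.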
In practical applications, all application programs are stored in ODPS, which can handle massive data computation more quickly, so that a plurality of application programs to be recommended can be quickly selected from the plurality of application programs; this can effectively reduce enterprise cost and guarantee data security.
Step 504: and determining at least two applications to be recommended matched with the user to be recommended based on the user identification.
The application program to be recommended may be regarded as an initial application program displayed to the user, and the exposure conversion rate of each application program to be recommended may be obtained when performing recommendation model prediction subsequently.
Specifically, after screening the plurality of application programs based on the first preset condition and determining at least two application programs to be recommended, the method further includes:
and matching the user to be recommended with the at least two application programs to be recommended based on a preset matching rule, wherein the user to be recommended carries a user identifier.
The preset matching rules include, but are not limited to: recommending the top-ranked popular applications to be recommended in each category (denoted Hot); recommending applications to be recommended according to each user's preference for Taobao categories (denoted U2C2I); grouping users according to gender/age/city/purchasing power/interest tags and recommending applications to be recommended that were clicked by users in the same group (denoted U2G2I); and/or recommending applications to be recommended that are similar to applications the user has clicked (denoted Item-CF).
Taking as an example the preset matching rule of recommending the top-ranked popular application programs in each category, matching the user to be recommended with the at least two application programs to be recommended based on the preset matching rule means that, based on the user identifier and the identifiers of the application programs, the top ten popular application programs in the game category are taken as the application programs to be recommended for the user to be recommended, so that the user to be recommended is matched with the application programs to be recommended.
After matching by the preset matching rules, each user to be recommended corresponds to dozens or hundreds of application programs to be recommended. The application programs to be recommended matched with each user to be recommended are then recorded in a database, such as an HBase database, based on the user identifier of each user to be recommended and the identifiers of the application programs to be recommended. After a recommendation request of the user to be recommended for an exposed application program is received, the application programs to be recommended matched with the user can be queried in the HBase database based on the user identifier of the user to be recommended.
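The matching stage can be sketched as merging the candidate lists produced by several rules and storing the result under the user identifier. In this illustration a plain dictionary stands in for the HBase table, and the rule functions, identifiers, and `limit` parameter are hypothetical.

```python
def match_candidates(user_id, rules, limit=200):
    """Merge candidates from all matching rules, keeping first-seen order
    and dropping duplicates."""
    seen, merged = set(), []
    for rule in rules:
        for app_id in rule(user_id):
            if app_id not in seen:
                seen.add(app_id)
                merged.append(app_id)
    return merged[:limit]

# Hypothetical rule outputs: Hot ignores the user, U2C2I depends on the user.
def hot_rule(user_id):
    return ["a1", "a2"]

def u2c2i_rule(user_id):
    return {"u1": ["a2", "a3"]}.get(user_id, [])

# Dictionary keyed by user identifier, standing in for the HBase table.
match_store = {}
for uid in ("u1", "u2"):
    match_store[uid] = match_candidates(uid, [hot_rule, u2c2i_rule])

# At request time, candidates are looked up by user identifier.
candidates_for_u1 = match_store.get("u1", [])
```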
Step 506: and extracting the user characteristics of the user to be recommended and the attribute characteristics of the at least two application programs to be recommended.
The user characteristics and the attribute characteristics comprise offline characteristics and real-time characteristics, wherein the offline characteristics comprise collected historical characteristics of the user to be recommended and the application program to be recommended, and the real-time characteristics comprise collected characteristics of the user to be recommended and the application program to be recommended at the current moment.
The extracted user features and the extracted attribute features may refer to the above embodiments, which are not described herein again.
Step 508: and inputting the user characteristics and the attribute characteristics into a pre-trained recommendation model to obtain the exposure conversion rate of the user to be recommended to each matched application program to be recommended.
The recommendation model comprises a deep FM multi-task learning model, and the deep FM multi-task learning model comprises a click rate estimation part and a purchase rate estimation part.
In practical applications, the offline characteristics of the extracted user characteristics and attribute characteristics are synchronized into an HBase database. When the recommendation model predicts the exposure conversion rate of the user to be recommended for each matched application program to be recommended, the offline characteristics of the user to be recommended and of each matched application program to be recommended can be read directly from the HBase database in real time, and the exposure conversion rate is then predicted in combination with the real-time characteristics extracted at that moment. When the recommendation model is used for this prediction, the attribute characteristics are recorded to form a feature log that flows back to ODPS; the recommendation model is then trained offline based on the feature log, and the trained recommendation model is updated, so as to realize continuous optimization of the recommendation model.
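The combination of offline and real-time characteristics, together with the feature log, can be sketched as follows. A dictionary stands in for the HBase lookup and a list stands in for the log that flows back to ODPS; all field names and the function `build_model_input` are assumptions.

```python
def build_model_input(user_id, app_id, offline_store, realtime_features, feature_log):
    """Merge offline and real-time features for one (user, app) pair and
    record the merged features for later offline training."""
    features = {}
    features.update(offline_store.get(("user", user_id), {}))
    features.update(offline_store.get(("app", app_id), {}))
    features.update(realtime_features)  # real-time values take precedence
    feature_log.append({"user_id": user_id, "app_id": app_id,
                        "features": dict(features)})
    return features

offline_store = {
    ("user", "u1"): {"user_age_group": "25-34"},
    ("app", "a1"): {"app_category": "game", "app_price": 30},
}
feature_log = []
model_input = build_model_input(
    "u1", "a1", offline_store, {"user_clicks_last_hour": 2}, feature_log)
```

Logging exactly the features used at prediction time keeps the offline training data aligned with what the online model saw.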
Step 510: recommending at least one application program to be recommended in the at least two applications programs to be recommended to a user to be recommended as an exposed application program based on the exposure conversion rate.
In one or more embodiments of the present specification, recommending, to the user to be recommended, at least one application to be recommended of the at least two applications to be recommended as an exposed application based on the exposure conversion rate includes:
sorting the at least two applications to be recommended based on the exposure conversion rate;
and selecting at least one to-be-recommended application program in the at least two to-be-recommended application programs based on a preset recommendation condition to be recommended to the to-be-recommended user as an exposed application program.
The sorting includes, but is not limited to, descending sorting, and the preset recommendation condition includes, but is not limited to, selecting the top 30 ranked applications to be recommended.
In practical applications, the at least two applications to be recommended may be sorted in descending order of exposure conversion rate, and the top 30 applications to be recommended are then recommended to the user to be recommended as exposed applications; these top 30 applications are the applications that are actually exposed and recommended to the user.
In another implementation manner, after the sorting the at least two applications to be recommended based on the exposure conversion rate, the method further includes:
and screening the at least two applications to be recommended based on a second preset condition.
The second preset condition may include, but is not limited to, identifying and removing applications that are in a preset blacklist.
Under the condition that the at least two applications to be recommended are screened based on the second preset condition, selecting at least one application to be recommended from the at least two applications to be recommended as an exposed application to be recommended to a user to be recommended based on a preset recommendation condition includes:
and selecting at least one application program to be recommended from the at least two application programs to be recommended after screening as an exposed application program to be recommended to the user to be recommended based on a preset recommendation condition.
For example, the at least two application programs to be recommended are sorted in descending order based on the exposure conversion rate and then matched against the application programs in a preset blacklist. Any application program to be recommended that matches the blacklist is removed from the sorted queue. The top 30 or top 20 application programs to be recommended with the highest exposure conversion rate are then selected and recommended to the user to be recommended as exposed application programs, so that the best application programs to be recommended are recommended to the user and the user experience is improved.
In actual use, when a recommendation request of a user to be recommended for an exposed application program is received, the screened application programs to be recommended that match the user are obtained first; the attribute characteristics of these application programs and the user characteristics of the matched user are then input into the pre-trained recommendation model to obtain the exposure conversion rate of the user to be recommended for each application program to be recommended.
Finally, the application programs to be recommended corresponding to the user to be recommended are sorted based on the exposure conversion rate, and the top 50 or top 60 application programs to be recommended are selected as required as the final exposed application programs and displayed to the user to be recommended.
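The sort, blacklist filter, and top-N selection described above can be sketched as below. The function name `recommend`, the score values, and the identifiers are illustrative assumptions.

```python
def recommend(scored_apps, blacklist=frozenset(), top_n=30):
    """Sort (app_id, pCTCVR) pairs in descending order of score, drop
    blacklisted apps, and keep the first `top_n`."""
    ranked = sorted(scored_apps, key=lambda item: item[1], reverse=True)
    kept = [app_id for app_id, _ in ranked if app_id not in blacklist]
    return kept[:top_n]

scored = [("a", 0.02), ("b", 0.05), ("c", 0.01), ("d", 0.04)]
exposed = recommend(scored, blacklist={"b"}, top_n=2)
```

Filtering before truncation means a blacklisted application never consumes one of the top-N slots.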
According to the recommendation method provided by one or more embodiments of the specification, the obtained application programs are first screened (that is, product selection filtering is performed) to select high-quality application programs to be recommended. Strategies such as Hot/U2C2I/U2G2I/Item-CF are then used to match the user with application programs to be recommended, which avoids building a tag system with a large amount of manual effort, and the multiple matching strategies better cover the application programs to be recommended that the user is likely to click. A deep learning recommendation model then performs online real-time exposure conversion rate prediction for the user to be recommended and the matched application programs to be recommended, and suitable application programs to be recommended are recommended to the user to be recommended as the final exposed application programs according to the exposure conversion rate, so that real-time characteristics are effectively utilized and the recommendation effect is improved.
In practical applications, the trained recommendation model in this specification needs to be deployed to an online server to score application programs online in real time. Generally, an arks platform is adopted, which provides high-performance online ranking and real-time estimation services, is highly available, and implements functions such as load balancing and remote disaster recovery. When a user requests application program recommendation, a retrieval module retrieves from HBase, according to the user identifier user_id, the hundreds of candidate application programs (namely the application programs to be recommended) that were matched for the user in advance in the matching stage. The offline and real-time characteristics of the user and of the application programs to be recommended are then fed to the recommendation model for real-time scoring, so as to obtain the exposure conversion rate of the user for each application program to be recommended. Finally, the application programs to be recommended are sorted in descending order of exposure conversion rate, and the top 30 are recommended and displayed to the user as the final exposed application programs. In addition, to avoid recommending unsuitable exposed application programs to the user, a blacklist filtering mechanism can be provided, so that bad-case exposed application programs can be filtered out urgently.
Referring to fig. 6, one or more embodiments of the present specification provide a training apparatus for a recommendation model, including:
a first obtaining module 602 configured to obtain user characteristics of at least two sample users and attribute characteristics of at least two sample applications;
a generating module 604 configured to generate, based on the user characteristics and the attribute characteristics, a positive sample in which a sample user clicked and purchased the exposed sample application and a negative sample in which the sample user clicked but did not purchase, or did not click, the exposed sample application;
a training module 606 configured to train a recommendation model based on a sample set comprising at least one positive sample and one negative sample, resulting in the recommendation model, the recommendation model outputting an exposure conversion rate for each sample user based on a click through rate and a purchase rate for a sample application for each exposure.
Optionally, the apparatus further comprises:
a first screening module configured to screen the set of samples into a training sample set including at least one positive sample and one negative sample and a test sample set including at least one positive sample and one negative sample based on a preset screening rule.
Optionally, the training module is further configured to:
training a recommendation model based on the set of training samples comprising at least one positive sample and one negative sample.
Optionally, the apparatus further comprises:
a testing module configured to test the recommendation model based on a set of test samples including at least one positive sample and one negative sample.
Optionally, the recommendation model includes a DeepFM multi-task learning model, and the DeepFM multi-task learning model includes a click rate estimation part and a purchase rate estimation part.
Optionally, the training module 606 comprises:
the first output sub-module is configured to output the click rate of each sample user to each exposed sample application program by the click rate estimation part of the recommendation model;
a second output sub-module configured to output a purchase rate of each sample user for each exposed sample application program by a purchase rate estimation part of the recommendation model;
a determination sub-module configured to determine an exposure conversion rate for each sample user for a sample application for each exposure based on the click-through rate and the purchase rate.
Optionally, the user characteristics and the attribute characteristics each include offline characteristics and real-time characteristics, wherein the offline characteristics include historical characteristics of the collected sample users and the sample applications, and the real-time characteristics include characteristics of the collected sample users and the sample applications at the time of events.
When the training device for the recommendation model provided by one or more embodiments of the specification trains the deep-learning-based DeepFM multi-task learning model, data such as whether sample users purchased or did not purchase exposed application programs after clicking, the automatic feature crossing capability of the DeepFM multi-task learning model, and the post-click conversion features of exposed application programs are fully utilized, so that the problem of feature sparsity is solved and the recommendation effect of exposed application programs is greatly improved.
The above is a schematic scheme of a training apparatus for a recommendation model according to this embodiment. It should be noted that the technical solution of the training apparatus for the recommendation model and the technical solution of the training method for the recommendation model belong to the same concept, and for details that are not described in the technical solution of the training apparatus, reference may be made to the description of the technical solution of the training method for the recommendation model.
Referring to fig. 7, one or more embodiments of the present specification further provide a recommendation apparatus including:
a receiving module 702, configured to receive a recommendation request of a user to be recommended for an exposed application program, where the user to be recommended carries a user identifier;
a determining module 704, configured to determine at least two applications to be recommended that match the user to be recommended based on the user identification;
an extracting module 706 configured to extract the user characteristics of the user to be recommended and the attribute characteristics of the at least two applications to be recommended;
an obtaining module 708, configured to input the user characteristics and the attribute characteristics into a pre-trained recommendation model, and obtain an exposure conversion rate of the user to be recommended for each matched application program to be recommended;
a first recommending module 710 configured to recommend at least one application to be recommended of the at least two applications to be recommended as an exposed application to a user to be recommended based on the exposure conversion rate.
Optionally, the apparatus further comprises:
the second acquisition module is configured to acquire a plurality of application programs carrying the identifiers;
the second screening module is configured to screen the plurality of application programs based on a first preset condition, and determine at least two application programs to be recommended.
Optionally, the apparatus further comprises:
the matching module is configured to match a user to be recommended with the at least two application programs to be recommended based on a preset matching rule, wherein the user to be recommended carries a user identifier.
Optionally, the recommendation model includes a DeepFM multi-task learning model, and the DeepFM multi-task learning model includes a click rate estimation part and a purchase rate estimation part.
Optionally, the first recommending module 710 includes:
a ranking submodule configured to rank the at least two applications to be recommended based on the exposure conversion;
and the second recommending submodule is configured to select at least one to-be-recommended application program in the at least two to-be-recommended application programs to be recommended to the to-be-recommended user as an exposed application program based on a preset recommending condition.
Optionally, the apparatus further comprises:
the third screening module is configured to screen the at least two applications to be recommended based on a second preset condition;
a second recommendation sub-module further configured to:
and selecting at least one application program to be recommended from the at least two application programs to be recommended after screening as an exposed application program to be recommended to the user to be recommended based on a preset recommendation condition.
Optionally, the user features and the attribute features both include offline features and real-time features, where the offline features include collected historical features of the user to be recommended and the application to be recommended, and the real-time features include collected features of the user to be recommended and the application to be recommended at the current time.
According to the recommendation device provided by one or more embodiments of the specification, the obtained application programs are first screened (that is, product selection filtering is performed) to select high-quality application programs to be recommended. Strategies such as Hot/U2C2I/U2G2I/Item-CF are then used to match the user with application programs to be recommended, which avoids building a tag system with a large amount of manual effort, and the multiple matching strategies better cover the application programs to be recommended that the user is likely to click. A deep learning recommendation model then performs online real-time exposure conversion rate prediction for the user to be recommended and the matched application programs to be recommended, and suitable application programs to be recommended are recommended to the user to be recommended as the final exposed application programs according to the exposure conversion rate, so that real-time characteristics are effectively utilized and the recommendation effect is improved.
The above is a schematic solution of a recommendation device of the present embodiment. It should be noted that the technical solution of the recommendation apparatus and the technical solution of the recommendation method described above belong to the same concept, and for details that are not described in detail in the technical solution of the recommendation apparatus, reference may be made to the description of the technical solution of the recommendation method described above.
The matching rules, user features and DeepFM multitask learning model provided in the specification were compared online using the A/B test platform of the darwinian laboratory. The results of the comparisons are given below. The main comparison metrics are the UV click rate (number of clicks / number of exposures) and the UV exposure conversion rate (number of purchasers / number of exposures).
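As a rough illustration of the two metrics, the sketch below computes them from a list of (user, event) records. It assumes that "UV" counts are deduplicated by user and that events are labelled `exposure`, `click` and `purchase`; both the record format and the labels are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def uv_metrics(events: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute UV click rate and UV exposure conversion rate.

    Each event is (user_id, event_type); users are counted once per
    event type (assumed interpretation of "UV").
    """
    users: Dict[str, Set[str]] = {"exposure": set(), "click": set(), "purchase": set()}
    for user_id, event_type in events:
        if event_type in users:
            users[event_type].add(user_id)

    exposed = len(users["exposure"])
    if exposed == 0:
        # No exposures: both rates are reported as zero by convention here.
        return {"uv_click_rate": 0.0, "uv_exposure_conversion_rate": 0.0}

    return {
        "uv_click_rate": len(users["click"]) / exposed,
        "uv_exposure_conversion_rate": len(users["purchase"]) / exposed,
    }
```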
1. Matching rules experiment
Compared with the common matching, using the Hot and U2C2I matching rules improves the UV click rate by a relative 9.93% (from 15.11% to 16.61%).
Compared with joint matching using Hot and U2C2I, joint matching using the Hot, U2C2I, U2G2I and Item-CF matching rules improves the UV click rate by a relative 10.86% (from 28.18% to 31.25%).
The experiments show that matching with a plurality of matching rules can noticeably improve the UV click rate.
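The improvement figures in this experiment read as relative changes rather than percentage-point differences. A minimal sketch of that arithmetic, using the first pair of rates quoted above (the function name is illustrative only):

```python
def relative_lift(baseline: float, treatment: float) -> float:
    """Relative change of `treatment` over `baseline`, as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (treatment - baseline) / baseline


# UV click rate going from 15.11% to 16.61%:
lift_percent = round(relative_lift(15.11, 16.61) * 100, 2)
print(lift_percent)  # 9.93
```

The absolute difference in this case is 1.5 percentage points; dividing by the 15.11% baseline gives the 9.93% relative figure.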
2. Feature experiment
With the interest preference feature added, the exposure conversion rate increased by 1.93% (from 15.78% to 16.08%) compared with not using the interest preference feature. This indicates that the click and purchase behavior of a user is closely related to the interest preference of the user.
3. Model experiment
First, an experiment compared a CTR model alone against multiplying the scores of two independent models, a CTR model and a CVR model. Both the CTR model and the CVR model adopt DeepFM, and their model structures and input features are identical. The experimental results are shown in Table 1. As can be seen from Table 1, multiplying the scores of the two separate CTR and CVR models lowers the UV click rate and raises the purchase rate after UV click, but the final UV exposure conversion rate is worse than that of the CTR model alone.
TABLE 1
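The score multiplication scheme compared in this experiment can be sketched as below: candidates are ranked either by the CTR score alone or by the product of the CTR and CVR scores. The score values are made-up numbers for illustration, and in the experiment the scores would come from two separately trained DeepFM models; none of the names here come from the specification.

```python
from typing import Dict, List


def rank_by_ctr(p_ctr: Dict[str, float]) -> List[str]:
    """Rank candidates by predicted click rate only."""
    return sorted(p_ctr, key=lambda app: (-p_ctr[app], app))


def rank_by_score_product(p_ctr: Dict[str, float], p_cvr: Dict[str, float]) -> List[str]:
    """Rank candidates by predicted click rate times predicted
    post-click purchase rate (score multiplication)."""
    return sorted(p_ctr, key=lambda app: (-(p_ctr[app] * p_cvr[app]), app))


# Hypothetical scores from two independent models.
p_ctr = {"a": 0.30, "b": 0.20, "c": 0.10}
p_cvr = {"a": 0.05, "b": 0.20, "c": 0.30}

print(rank_by_ctr(p_ctr))                   # ['a', 'b', 'c']
print(rank_by_score_product(p_ctr, p_cvr))  # ['b', 'c', 'a']
```

The example shows why the two schemes can disagree: an application program with a high click score but a low purchase score drops in the product ranking, which is consistent with the lower click rate and higher post-click purchase rate reported for score multiplication.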
An online comparison experiment was then performed between the CTR model and the DeepFM-based multitask learning model, with the input features, the DeepFM model structure and the sample data size of the two models being identical. The experimental results are shown in Table 2 below. Taking the DeepFM-based CTR model as the baseline, the DeepFM-based multitask learning model is 3.73% worse in UV click rate, 5.01% better in purchase rate after UV click, and 1.09% better in overall UV exposure conversion rate. The experiments show that the DeepFM-based multitask learning model can improve the UV exposure conversion rate.
TABLE 2
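The three relative changes reported for this experiment are mutually consistent if the exposure conversion rate is taken to be the product of the click rate and the purchase rate after click. That decomposition is an assumption made here for the check, in line with the earlier statement that the exposure conversion rate is obtained from the click rate and the purchase rate. Under it, relative changes combine multiplicatively:

```python
def combined_relative_change(click_rate_change: float,
                             post_click_purchase_change: float) -> float:
    """Relative change of a product of two rates, given the relative
    change of each factor (all values as fractions, e.g. -0.0373)."""
    return (1 + click_rate_change) * (1 + post_click_purchase_change) - 1


# Click rate 3.73% lower, purchase rate after click 5.01% higher:
change_percent = round(combined_relative_change(-0.0373, 0.0501) * 100, 2)
print(change_percent)  # 1.09
```

So a 3.73% drop in click rate combined with a 5.01% rise in post-click purchase rate yields roughly the 1.09% gain in exposure conversion rate stated above.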
An embodiment of the present application further provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the training method of the recommendation model or of the recommendation method described above.
The above is an illustrative scheme of a computer-readable storage medium of the embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the above-mentioned training method of the recommendation model or the above-mentioned technical solution of the recommendation method, and details that are not described in detail in the technical solution of the storage medium can be referred to the above-mentioned description of the training method of the recommendation model or the above-mentioned technical solution of the recommendation method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately expanded or restricted as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.
It should be noted that for simplicity and convenience of description, the above-described method embodiments are described as a series of combinations of acts, but those skilled in the art will appreciate that the present application is not limited by the order of acts, as some steps may, in accordance with the present application, occur in other orders and/or concurrently. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present application disclosed above are intended only to aid in the explanation of the application. The alternative embodiments do not describe every detail exhaustively, nor do they limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and its practical application, and thereby to enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.