CN110008399A

CN110008399A - Training method and device for a recommendation model, and recommendation method and device

Info

Publication number: CN110008399A
Application number: CN201910090621.6A
Authority: CN
Inventors: 谢仁强
Original assignee: Alibaba Group Holding Ltd
Current assignee: Advanced New Technologies Co Ltd; Advantageous New Technologies Co Ltd
Priority date: 2019-01-30
Filing date: 2019-01-30
Publication date: 2019-07-12
Anticipated expiration: 2039-01-30
Also published as: CN110008399B

Abstract

The present application provides a training method and device for a recommendation model, and a recommendation method and device. The training method includes: obtaining user features of at least two sample users and attribute features of at least two sample applications; generating, based on the user features and the attribute features, positive samples in which a sample user clicked and purchased an exposed sample application, and negative samples in which a sample user either clicked an exposed sample application without purchasing it or did not click it; and training a recommendation model on a sample set including at least one positive sample and at least one negative sample to obtain the recommendation model, which outputs, for each sample user and each exposed sample application, an exposure conversion rate obtained from the click rate and the purchase rate.

Description

Training method and device for a recommendation model, and recommendation method and device

Technical field

The present application relates to the field of distributed computer processing technology, and in particular to a training method and device for a recommendation model, a recommendation method and device, a computing device, and a computer-readable storage medium.

Background

In some internet products, personalized recommendation is needed so that users can quickly find the goods they want (for example, apps): the goods that a user is most likely to click and purchase are recommended to that user.

At present, some platforms recommend goods simply from the labels of users and apps, using a traditional machine learning model (for example, an LR or GBDT model). Label-based recommendation requires substantial manual effort to build a label system and can easily recommend unsuitable apps to users (for example, apps involving pornography, gambling, or drugs). In addition, a traditional machine learning model estimates only the click rate of an app and does not make full use of the purchase data that follows a click; an app with a high click rate does not necessarily have a high post-click conversion rate.

Summary of the invention

In view of this, the embodiments of this specification provide a training method and device for a recommendation model, a recommendation method and device, a computing device, and a computer-readable storage medium, to address technical deficiencies in the prior art.

In a first aspect, an embodiment of this specification discloses a training method for a recommendation model, comprising:

obtaining user features of at least two sample users and attribute features of at least two sample applications;

generating, based on the user features and the attribute features, positive samples in which a sample user clicked and purchased an exposed sample application, and negative samples in which a sample user clicked an exposed sample application but did not purchase it, or did not click it;

training a recommendation model on a sample set including at least one positive sample and at least one negative sample to obtain the recommendation model, where the recommendation model outputs, for each sample user and each exposed sample application, an exposure conversion rate obtained from the click rate and the purchase rate.

Optionally, before the recommendation model is trained on the sample set including at least one positive sample and at least one negative sample, the method further comprises:

dividing the sample set, based on a preset screening rule, into a training sample set including at least one positive sample and at least one negative sample and a test sample set including at least one positive sample and at least one negative sample.

Optionally, training the recommendation model on the sample set including at least one positive sample and at least one negative sample comprises:

training the recommendation model on the training sample set including at least one positive sample and at least one negative sample.

Optionally, after the recommendation model is trained on the sample set including at least one positive sample and at least one negative sample, the method further comprises:

testing the recommendation model on the test sample set including at least one positive sample and at least one negative sample.

Optionally, the recommendation model includes a DeepFM multi-task learning model, and the DeepFM multi-task learning model includes a click-rate estimation part and a purchase-rate estimation part.

Optionally, the recommendation model outputting, for each sample user and each exposed sample application, the exposure conversion rate obtained from the click rate and the purchase rate comprises:

the click-rate estimation part of the recommendation model outputs the click rate of each sample user for each exposed sample application;

the purchase-rate estimation part of the recommendation model outputs the purchase rate of each sample user for each exposed sample application;

the exposure conversion rate of each sample user for each exposed sample application is determined based on the click rate and the purchase rate.
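As an illustration of the last step, the following minimal Python sketch combines the two estimates into an exposure conversion rate. The text above states only that the exposure conversion rate is determined from the click rate and the purchase rate; taking it to be their product, as is common in multi-task click and conversion models, is an assumption of this sketch. The function name and the example values are hypothetical.

```python
def exposure_conversion_rate(ctr: float, cvr: float) -> float:
    """Combine a click rate and a post-click purchase rate.

    Assumption (not stated in the text): the exposure conversion
    rate is the product of the two estimated rates.
    """
    if not (0.0 <= ctr <= 1.0 and 0.0 <= cvr <= 1.0):
        raise ValueError("rates must lie in [0, 1]")
    return ctr * cvr


# Hypothetical estimates (click rate, purchase rate) for one sample user
# and three exposed sample applications.
predictions = {
    "app_a": (0.20, 0.10),
    "app_b": (0.05, 0.50),
    "app_c": (0.30, 0.02),
}

scores = {
    app: exposure_conversion_rate(ctr, cvr)
    for app, (ctr, cvr) in predictions.items()
}
best_app = max(scores, key=scores.get)
```

Under this assumption, the application with the highest click rate (app_c) is not the one with the highest exposure conversion rate, which is the situation the background section describes.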

Optionally, the user features and the attribute features include offline features and real-time features, where the offline features include collected historical features of the sample users and the sample applications, and the real-time features include collected features of the sample users and the sample applications at the time an event occurs.

In a second aspect, an embodiment of this specification further provides a recommendation method, comprising:

receiving a recommendation request from a user to be recommended for applications to be exposed, where the user to be recommended carries a user identifier;

determining, based on the user identifier, at least two applications to be recommended that match the user to be recommended;

extracting user features of the user to be recommended and attribute features of the at least two applications to be recommended;

inputting the user features and the attribute features into a pre-trained recommendation model to obtain the exposure conversion rate of the user to be recommended for each matched application to be recommended;

recommending, based on the exposure conversion rate, at least one of the at least two applications to be recommended to the user to be recommended as an exposed application.

Optionally, before the recommendation request of the user to be recommended for applications to be exposed is received, the method further comprises:

obtaining multiple applications that carry labels;

screening the multiple applications based on a first preset condition to determine at least two applications to be recommended.

Optionally, after the at least two applications to be recommended are determined, the method further comprises:

matching the user to be recommended with the at least two applications to be recommended based on a preset matching rule, where the user to be recommended carries a user identifier.

Optionally, recommending, based on the exposure conversion rate, at least one of the at least two applications to be recommended to the user to be recommended as an exposed application comprises:

ranking the at least two applications to be recommended based on the exposure conversion rate;

selecting, based on a preset recommendation condition, at least one of the at least two applications to be recommended and recommending it to the user to be recommended as an exposed application.

Optionally, after the at least two applications to be recommended are ranked based on the exposure conversion rate, the method further comprises:

screening the at least two applications to be recommended based on a second preset condition;

in which case selecting, based on the preset recommendation condition, at least one of the at least two applications to be recommended and recommending it to the user to be recommended as an exposed application comprises:

selecting, based on the preset recommendation condition, at least one of the at least two applications to be recommended that remain after the screening, and recommending it to the user to be recommended as an exposed application.
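The ranking, screening, and selection steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not specify the second preset condition or the preset recommendation condition, so the sketch uses a hypothetical block list as the second preset condition and "keep the top k" as the preset recommendation condition. The function name, parameters, and scores are hypothetical.

```python
from typing import Callable, Dict, List


def recommend(scores: Dict[str, float],
              passes_screening: Callable[[str], bool],
              top_k: int) -> List[str]:
    """Rank candidates by exposure conversion rate, screen them, keep top_k."""
    # Step 1: rank by exposure conversion rate, highest first.
    ranked = sorted(scores, key=lambda app: scores[app], reverse=True)
    # Step 2: screen with the (assumed) second preset condition.
    kept = [app for app in ranked if passes_screening(app)]
    # Step 3: apply the (assumed) preset recommendation condition: top k.
    return kept[:top_k]


# Hypothetical exposure conversion rates for one user to be recommended.
scores = {"app_a": 0.020, "app_b": 0.025, "app_c": 0.006, "app_d": 0.015}
blocked = {"app_a"}  # hypothetical block list

result = recommend(scores, lambda app: app not in blocked, top_k=2)
```

Screening after ranking, as the text describes, keeps the relative order of the remaining candidates unchanged, so the selection step only needs to take a prefix of the list.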

Optionally, the user features and the attribute features include offline features and real-time features, where the offline features include collected historical features of the user to be recommended and the applications to be recommended, and the real-time features include collected features of the user to be recommended and the applications to be recommended at the current time.

In a third aspect, an embodiment of this specification further provides a training device for a recommendation model, comprising:

a first obtaining module, configured to obtain user features of at least two sample users and attribute features of at least two sample applications;

a generation module, configured to generate, based on the user features and the attribute features, positive samples in which a sample user clicked and purchased an exposed sample application, and negative samples in which a sample user clicked an exposed sample application but did not purchase it, or did not click it;

a training module, configured to train a recommendation model on a sample set including at least one positive sample and at least one negative sample to obtain the recommendation model, where the recommendation model outputs, for each sample user and each exposed sample application, an exposure conversion rate obtained from the click rate and the purchase rate.

Optionally, the device further comprises:

a first screening module, configured to divide the sample set, based on a preset screening rule, into a training sample set including at least one positive sample and at least one negative sample and a test sample set including at least one positive sample and at least one negative sample.

Optionally, the training module is further configured to train the recommendation model on the training sample set including at least one positive sample and at least one negative sample.

Optionally, the device further comprises:

a test module, configured to test the recommendation model on the test sample set including at least one positive sample and at least one negative sample.

Optionally, the training module comprises:

a first output submodule, configured so that the click-rate estimation part of the recommendation model outputs the click rate of each sample user for each exposed sample application;

a second output submodule, configured so that the purchase-rate estimation part of the recommendation model outputs the purchase rate of each sample user for each exposed sample application;

a determination submodule, configured to determine, based on the click rate and the purchase rate, the exposure conversion rate of each sample user for each exposed sample application.

In a fourth aspect, an embodiment of this specification further provides a recommendation device, comprising:

a receiving module, configured to receive a recommendation request from a user to be recommended for applications to be exposed, where the user to be recommended carries a user identifier;

a determining module, configured to determine, based on the user identifier, at least two applications to be recommended that match the user to be recommended;

an extraction module, configured to extract user features of the user to be recommended and attribute features of the at least two applications to be recommended;

an obtaining module, configured to input the user features and the attribute features into a pre-trained recommendation model to obtain the exposure conversion rate of the user to be recommended for each matched application to be recommended;

a first recommendation module, configured to recommend, based on the exposure conversion rate, at least one of the at least two applications to be recommended to the user to be recommended as an exposed application.

Optionally, the device further comprises:

a second obtaining module, configured to obtain multiple applications that carry labels;

a second screening module, configured to screen the multiple applications based on a first preset condition to determine at least two applications to be recommended.

Optionally, the device further comprises:

a matching module, configured to match the user to be recommended with the at least two applications to be recommended based on a preset matching rule, where the user to be recommended carries a user identifier.

Optionally, the first recommendation module comprises:

a ranking submodule, configured to rank the at least two applications to be recommended based on the exposure conversion rate;

a second recommendation submodule, configured to select, based on a preset recommendation condition, at least one of the at least two applications to be recommended and recommend it to the user to be recommended as an exposed application.

Optionally, the device further comprises:

a third screening module, configured to screen the at least two applications to be recommended based on a second preset condition;

the second recommendation submodule being further configured to select, based on the preset recommendation condition, at least one of the at least two applications to be recommended that remain after the screening, and recommend it to the user to be recommended as an exposed application.

In a fifth aspect, an embodiment of this specification discloses a computing device, including a memory, a processor, and computer instructions that are stored in the memory and can run on the processor, where the processor, when executing the instructions, implements the steps of the training method for a recommendation model or of the recommendation method described above.

In a sixth aspect, an embodiment of this specification discloses a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the steps of the training method for a recommendation model or of the recommendation method described above.

The present application provides a training method and device for a recommendation model, a recommendation method and device, a computing device, and a computer-readable storage medium. The recommendation method includes: receiving a recommendation request from a user to be recommended for applications to be exposed, where the user to be recommended carries a user identifier; determining, based on the user identifier, at least two applications to be recommended that match the user to be recommended; extracting user features of the user to be recommended and attribute features of the at least two applications to be recommended; inputting the user features and the attribute features into a pre-trained recommendation model to obtain the exposure conversion rate of the user to be recommended for each matched application to be recommended; and recommending, based on the exposure conversion rate, at least one of the at least two applications to be recommended to the user to be recommended as an exposed application. The method uses a recommendation model based on deep learning, namely a model with a multi-task-learning DeepFM structure, to recommend exposed applications online and in real time for the user to be recommended. It makes effective use of the model's automatic feature-crossing capability and of the conversion features that follow a click on an exposed application, addresses the feature sparsity problem, and greatly improves the recommendation effect for exposed applications.

Brief description of the drawings

Fig. 1 is a structural block diagram of a computing device provided by one or more embodiments of this specification;

Fig. 2 is a flowchart of a training method for a recommendation model provided by one or more embodiments of this specification;

Fig. 3 is a schematic diagram of the network structure of the DeepFM model provided by one or more embodiments of this specification;

Fig. 4 is a schematic diagram of the network structure of the DeepFM multi-task learning model provided by one or more embodiments of this specification;

Fig. 5 is a flowchart of a recommendation method provided by one or more embodiments of this specification;

Fig. 6 is a schematic structural diagram of a training device for a recommendation model provided by one or more embodiments of this specification;

Fig. 7 is a schematic structural diagram of a recommendation device provided by one or more embodiments of this specification.

Detailed description

Many specific details are set out in the following description to provide a full understanding of the present application. However, the present application can be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from the substance of the present application; the present application is therefore not limited by the specific implementations disclosed below.

The terms used in one or more embodiments of this specification are for the purpose of describing particular embodiments only and are not intended to limit the one or more embodiments of this specification. The singular forms "a", "said", and "the" used in one or more embodiments of this specification and in the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used in one or more embodiments of this specification refers to and includes any or all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, and so on may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of one or more embodiments of this specification, a first may also be referred to as a second, and similarly a second may also be referred to as a first. Depending on the context, the word "if" as used here may be interpreted as "at the time of", "when", or "in response to determining".

First, the terms used in one or more embodiments of the present invention are explained.

FM: Factorization Machines, a machine learning algorithm based on matrix factorization proposed by Steffen Rendle. It can make predictions for arbitrary real-valued feature vectors. Its main advantages are that (1) it can be used with highly sparse data and (2) it has linear computational complexity. In the present application, its main purpose is to solve the problem of how to combine features when the data is sparse.

DNN: Deep Neural Network. Divided by the position of its layers, the neural network layers inside a DNN fall into three classes: input layer, hidden layers, and output layer.

DeepFM model: a deep learning model, namely a neural network architecture that integrates FM and DNN. It combines the advantages of DNN and FM and can extract low-order and high-order combined features at the same time.

CTR: Click-Through Rate, referred to in this document as the click rate.

CVR: Conversion Rate, referred to in this document as the purchase rate.

MTL: Multi-Task Learning, an inductive transfer mechanism whose main goal is to improve generalization by using the domain-specific information contained in the training signals of multiple related tasks. Multi-task learning achieves this by training multiple tasks in parallel with a shared representation.

TL: Transfer Learning, which transfers the knowledge of one domain (the source domain) to another domain (the target domain) so that the target domain achieves a better learning effect.

LR: Logistic Regression.

GBDT: Gradient Boosting Decision Tree.
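To make the FM entry in the glossary above more concrete, the following Python sketch evaluates the standard second-order FM score: a bias, a linear term, and pairwise interactions weighted by inner products of latent vectors. It is a minimal illustration rather than the model used in this application. It computes the pairwise term in two ways, the direct double loop and the well-known reformulation that is linear in the number of features, which is the property the glossary refers to. All parameter values are hypothetical.

```python
from typing import List


def fm_predict(x: List[float], w0: float, w: List[float],
               v: List[List[float]]) -> float:
    """Second-order FM score, computed in O(k * n) time."""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(v[0])):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        # 0.5 * ((sum of terms)^2 - sum of squared terms) equals the
        # sum over all pairs i < j for this latent dimension.
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict_naive(x: List[float], w0: float, w: List[float],
                     v: List[List[float]]) -> float:
    """Same score with an explicit loop over feature pairs, O(k * n^2)."""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            pairwise += dot * x[i] * x[j]
    return w0 + linear + pairwise


# Hypothetical sparse input (second feature is zero) and parameters.
x = [1.0, 0.0, 2.0]
w0 = 0.1
w = [0.2, 0.3, 0.4]
v = [[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]]

fast = fm_predict(x, w0, w, v)
slow = fm_predict_naive(x, w0, w, v)
```

Because zero-valued features contribute nothing to either sum, the linear-time form only has to visit the non-zero entries of a sparse input.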

The present application provides a training method and device for a recommendation model, a recommendation method and device, a computing device, and a computer-readable storage medium, each of which is described in detail in the following embodiments.

Fig. 1 is a structural block diagram of a computing device 100 according to an embodiment of this specification. The components of the computing device 100 include, but are not limited to, a memory 110 and a processor 120. The processor 120 is connected to the memory 110 via a bus 130, and a database 150 is used to store data.

The computing device 100 further includes an access device 140, which enables the computing device 100 to communicate via one or more networks 160. Examples of these networks include the public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the internet. The access device 140 may include one or more of any type of wired or wireless network interface (for example, a network interface card (NIC)), such as an IEEE 802.11 wireless local area network (WLAN) interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a universal serial bus (USB) interface, a cellular network interface, a Bluetooth interface, a near-field communication (NFC) interface, and so on.

In an embodiment of this specification, the above components of the computing device 100 and other components not shown in Fig. 1 may also be connected to one another, for example via a bus. It should be understood that the structural block diagram of the computing device shown in Fig. 1 is for illustrative purposes only and does not limit the scope of this specification. Those skilled in the art may add or replace other components as needed.

The computing device 100 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (for example, a tablet computer, personal digital assistant, laptop computer, notebook computer, or netbook), a mobile phone (for example, a smartphone), a wearable computing device (for example, a smartwatch or smart glasses), or another type of mobile device, or a stationary computing device such as a desktop computer or PC. The computing device 100 may also be a mobile or stationary server.

The processor 120 can execute the steps of the method shown in Fig. 2. Fig. 2 is a schematic flowchart of a training method for a recommendation model according to an embodiment of this specification, including steps 202 to 206.

Step 202: obtain user features of at least two sample users and attribute features of at least two sample applications.

For example, the user features of at least two sample users and the attribute features of at least two sample applications within a preset duration may be obtained. The preset duration may be 60 days, 120 days, and so on; it is set according to actual needs, and the present application does not limit it.

In practice, the sample users and the sample applications may carry user identifiers and application identifiers, respectively. An identifier is unique identification information that distinguishes each sample user or each sample application, for example a unique character string or mark assigned to each sample user or each sample application for identification.

The sample users are the users to whom sample applications are recommended. The sample applications include, but are not limited to, office, game, and entertainment applications.

The user features and the attribute features include offline features and real-time features, where the offline features include collected historical features of the sample users and the sample applications, and the real-time features include collected features of the sample users and the sample applications at the time an event occurs.

The offline features among the user features include, but are not limited to: basic profile features of the user, such as age, gender, zodiac sign, occupation, education level, and life stage; wealth features of the user, such as income, purchasing power, probability of owning a home, and probability of owning a car; location features of the user, such as birthplace, workplace, home location, and permanent residence; behavior features of the user, such as the user's exposure count, click count, and click rate for applications; and other features, such as interest preference features, search query features, activity features, historical transaction features, and real-time red envelope features. The real-time features among the user features include, but are not limited to, scene features of the user, such as the channel from which the user arrives; that is, in practice, the sample applications that a sample user likes can be determined from the channel through which the sample user jumped in.

The offline features among the attribute features include, but are not limited to: basic attribute features of the application, such as its category, price, rating, number of reviews, ranking, and language; and statistical features of the application, such as, over the most recent 1/3/7/15/30/90 days, its exposure pv (page views, the exposure count), exposure uv (unique visitors, the number of people exposed), click pv (the click count), click uv (the number of people who clicked), pv click rate, and uv click rate (number of people who clicked / number of people exposed). The real-time features among the attribute features include, but are not limited to, scene features, such as the current hour and day of the week for the application; that is, in practice, the sample applications that a sample user likes can be determined from the current time and day of the week.
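The statistical features listed above can be derived from an event log, as in the following Python sketch. It is an illustration only: the log layout, the function name, and the field names are hypothetical. The uv click rate follows the definition given in the text (number of people who clicked divided by number of people exposed); taking the pv click rate to be click pv divided by exposure pv is an assumption of this sketch, since the text does not define it.

```python
from typing import Dict, List, Tuple

# Hypothetical event log entries: (user_id, item_id, event_type),
# where event_type is "expose" or "click".
Event = Tuple[str, str, str]


def click_statistics(events: List[Event], item_id: str) -> Dict[str, float]:
    """Compute exposure/click pv and uv and the two click rates for one item."""
    exposed_users = [u for u, i, e in events if i == item_id and e == "expose"]
    clicking_users = [u for u, i, e in events if i == item_id and e == "click"]

    expose_pv, click_pv = len(exposed_users), len(clicking_users)
    expose_uv, click_uv = len(set(exposed_users)), len(set(clicking_users))

    return {
        "expose_pv": expose_pv,
        "expose_uv": expose_uv,
        "click_pv": click_pv,
        "click_uv": click_uv,
        # Assumed definition: click pv / exposure pv.
        "pv_ctr": click_pv / expose_pv if expose_pv else 0.0,
        # Definition from the text: click uv / exposure uv.
        "uv_ctr": click_uv / expose_uv if expose_uv else 0.0,
    }


events = [
    ("u1", "app1", "expose"), ("u1", "app1", "expose"),
    ("u2", "app1", "expose"), ("u3", "app1", "expose"),
    ("u1", "app1", "click"), ("u1", "app1", "click"),
    ("u2", "app2", "expose"),
]

stats = click_statistics(events, "app1")
```

In practice the same computation would be repeated for each of the 1/3/7/15/30/90-day windows by filtering the log on event time first.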

In practice, the sample applications can be obtained from ODPS (Open Data Processing Service), which stores the original full set of applications.

Step 204: generate, based on the user features and the attribute features, positive samples in which a sample user clicked and purchased an exposed sample application, and negative samples in which a sample user clicked an exposed sample application but did not purchase it, or did not click it.

Here, an exposed sample application is a sample application that has been shown to a sample user.

To estimate the exposure conversion rate of each sample user for each exposed sample application, 60 days of sample users, together with the sample applications that were clicked or not clicked after being exposed to them, can be obtained. These are then parsed according to the user features of the sample users and the attribute features of the exposed sample applications to form the post-exposure click positive and negative samples and the post-exposure conversion positive and negative samples (the labels). After the user features and the attribute features are joined on, a sample set including at least one positive sample and at least one negative sample is generated, where each positive or negative sample in the sample set can be expressed as (features, label).

In practice, the user identifier (user_id) of the sample user and the application identifier (item_id) of the sample application can also be combined when generating the post-exposure click samples and the post-exposure conversion samples (the labels). If a sample application is exposed and the sample user clicks it, the record is an exposure click positive sample (y1=1); if the sample application is exposed but the sample user does not click it, the record is an exposure click negative sample (y1=0). If the sample user clicks and purchases after the sample application is exposed, the record is an exposure conversion positive sample (y2=1); if the sample user does not click, or clicks but does not purchase, after the sample application is exposed, the record is an exposure conversion negative sample (y2=0). Each record can be expressed in the form (user_id, item_id, y1, y2).
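The labeling rule described above can be sketched in Python as follows. The rule for y1 and y2 is taken from the text; the layout of the exposure log, the function name, and the record values are hypothetical and only serve the illustration.

```python
from typing import Dict, List, Tuple


def label_exposure(clicked: bool, purchased: bool) -> Tuple[int, int]:
    """Return (y1, y2) for one exposed sample application.

    y1 = 1 if the exposure was clicked, else 0.
    y2 = 1 only if the exposure was clicked and then purchased, else 0.
    """
    y1 = 1 if clicked else 0
    y2 = 1 if (clicked and purchased) else 0
    return y1, y2


# Hypothetical exposure log: one record per (user, application) exposure.
exposure_log: List[Dict] = [
    {"user_id": "u1", "item_id": "a1", "clicked": True, "purchased": True},
    {"user_id": "u1", "item_id": "a2", "clicked": True, "purchased": False},
    {"user_id": "u2", "item_id": "a1", "clicked": False, "purchased": False},
]

# Each entry has the form (user_id, item_id, y1, y2).
labels = [
    (r["user_id"], r["item_id"], *label_exposure(r["clicked"], r["purchased"]))
    for r in exposure_log
]
```

Note that a record can be a click positive sample and a conversion negative sample at the same time (the second record), which is why the two labels are kept separately.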

Then, after the user features and the attribute features are joined on by user_id and item_id, a sample set including at least one positive sample and at least one negative sample is generated, where each positive or negative sample in the sample set can be expressed as (user_id, item_id, features, y1, y2).
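The join by user_id and item_id can be illustrated with the following Python sketch, which turns labeled records into samples of the form (user_id, item_id, features, y1, y2). The feature tables, feature names, and the key prefixes used to keep user and application features apart are hypothetical choices of this sketch, not details given in the text.

```python
from typing import Dict, List, Tuple

Label = Tuple[str, str, int, int]
Sample = Tuple[str, str, Dict[str, object], int, int]


def join_features(labels: List[Label],
                  user_features: Dict[str, Dict[str, object]],
                  item_features: Dict[str, Dict[str, object]]) -> List[Sample]:
    """Attach user and application features to each labeled record."""
    samples: List[Sample] = []
    for user_id, item_id, y1, y2 in labels:
        # Prefix the keys so user and application features cannot collide.
        features = {f"user_{k}": v for k, v in user_features[user_id].items()}
        features.update(
            {f"item_{k}": v for k, v in item_features[item_id].items()}
        )
        samples.append((user_id, item_id, features, y1, y2))
    return samples


# Hypothetical labeled records and feature tables.
labels = [("u1", "a1", 1, 1), ("u2", "a1", 0, 0)]
user_features = {"u1": {"age": 30}, "u2": {"age": 41}}
item_features = {"a1": {"category": "game", "price": 6.0}}

samples = join_features(labels, user_features, item_features)
```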

Step 206: train a recommendation model on the sample set including at least one positive sample and at least one negative sample to obtain the recommendation model, where the recommendation model outputs, for each sample user and each exposed sample application, an exposure conversion rate obtained from the click rate and the purchase rate.

In one or more embodiments of this specification, before the recommendation model is trained on the sample set including at least one positive sample and at least one negative sample, the method further includes dividing the sample set, based on a preset screening rule, into a training sample set including at least one positive sample and at least one negative sample and a test sample set including at least one positive sample and at least one negative sample.

The preset screening rule includes, but is not limited to, choosing a predetermined number of positive and negative samples within a preset duration from the generated sample set as the test sample set and using the rest as the training sample set. For example, the preset screening rule may be to extract 200,000 positive and negative samples from the most recent day's samples in the generated set of 140 million samples as the test sample set, with the remaining positive and negative samples forming the training sample set. In practice, because the rate at which exposures of an application turn into conversions is relatively low, a test sample set with too few samples contains few exposure conversion positive samples and the evaluation may be inaccurate; choosing 1,500,000 positive and negative samples as the test sample set is therefore more appropriate.
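One possible form of such a screening rule is sketched below in Python: a fixed number of samples is drawn from the most recent day as the test sample set, and everything else becomes the training sample set. The text does not say how the samples within the most recent day are chosen, so drawing them at random with a fixed seed is an assumption of this sketch, as are the function name and the "day" field.

```python
import random
from typing import Dict, List, Tuple


def split_samples(samples: List[Dict], latest_day: str, test_size: int,
                  seed: int = 0) -> Tuple[List[Dict], List[Dict]]:
    """Hold out up to test_size samples from latest_day as the test set."""
    recent_idx = [i for i, s in enumerate(samples) if s["day"] == latest_day]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    test_idx = set(rng.sample(recent_idx, min(test_size, len(recent_idx))))
    test = [s for i, s in enumerate(samples) if i in test_idx]
    train = [s for i, s in enumerate(samples) if i not in test_idx]
    return train, test


# Hypothetical samples; only the "day" field matters for the split.
samples = [
    {"id": 0, "day": "d1"}, {"id": 1, "day": "d1"},
    {"id": 2, "day": "d2"}, {"id": 3, "day": "d2"}, {"id": 4, "day": "d2"},
]

train_set, test_set = split_samples(samples, latest_day="d2", test_size=2)
```

Holding out the most recent day, rather than a random slice of the whole period, evaluates the model on data that is later in time than most of what it was trained on.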

Training the recommendation model on the sample set including at least one positive sample and one negative sample then means training it on the training sample set including at least one positive sample and one negative sample to obtain the recommendation model, which outputs, for each sample user and each exposed sample application, the exposure conversion rate obtained from the click-through rate and the purchase rate.

In practice, after the recommendation model is trained on the sample set including at least one positive sample and one negative sample, the method further includes:

Testing the recommendation model on the test sample set including at least one positive sample and one negative sample means inputting the test sample set into the recommendation model, so that the model outputs, for each sample user and each exposed sample application, the exposure conversion rate obtained from the click-through rate and the purchase rate. The recommendation model outputting this exposure conversion rate includes:

The recommendation model includes, but is not limited to, a DeepFM multi-task learning model. One or more embodiments of this specification are described taking the DeepFM multi-task learning model as an example.

The DeepFM multi-task learning model applies multi-task learning (MTL) to the DeepFM model. The DeepFM model combines the advantages of a DNN and an FM and can extract low-order and high-order feature combinations at the same time. The FM part extracts low-order feature combinations: the linear combination of first-order features (dot product of weights and features) and second-order cross features (inner products of latent vectors). The Deep part extracts high-order feature combinations. The FM and Deep parts share the same input and the same embedding vectors. Specifically, the prediction of the DeepFM model is given by formula (1):

The output of the FM part is given by formula (2):

The output of the DNN part is given by formula (3):

$$y_{DNN} = \sigma\left(W^{|H|+1} \cdot a^{H} + b^{|H|+1}\right) \tag{3}$$

Fig. 3 shows the network structure of the DeepFM model. The model is divided into a Deep neural network part and an FM factorization machine part. The Deep part may be a fully connected feedforward neural network (DNN). The DNN and the FM divide the input user features and attribute features into multiple feature groups, each of which corresponds to one embedding vector. In the Deep part, a feature concatenation layer (concat) concatenates all embedding vectors, followed by two fully connected layers (Fc (relu)), which realizes high-order feature combination. The FM part computes a weighted sum (addition) of the raw input features, such as the user features and attribute features, and extracts feature combinations from the inner products of the embedding vectors, which realizes low-order feature combination. Finally, the outputs of the Deep part and the FM part are combined to obtain the prediction (sigmoid), i.e. the click-through rate prediction.
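
The structure described for Fig. 3 can be sketched as a forward pass in plain Python. This is an illustrative sketch under stated assumptions, not the model of this specification: each feature group is assumed to have exactly one active feature, the layer sizes and random initialization are arbitrary, the output of the deep part is left linear before the final sigmoid as a simplification, and all names (`init_params`, `deepfm_predict`) are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dense(x, weights, bias, relu=True):
    """Fully connected layer; weights holds one row per output unit."""
    out = [dot(row, x) + b for row, b in zip(weights, bias)]
    return [max(0.0, v) for v in out] if relu else out


def init_params(n_features, n_fields, k=4, hidden=8, seed=0):
    rng = random.Random(seed)

    def vec(n):
        return [rng.uniform(-0.1, 0.1) for _ in range(n)]

    def mat(rows, cols):
        return [vec(cols) for _ in range(rows)]

    return {
        "w": vec(n_features),             # first-order weights (FM)
        "V": mat(n_features, k),          # embeddings shared by FM and Deep
        "W1": mat(hidden, n_fields * k), "b1": [0.0] * hidden,
        "W2": mat(hidden, hidden), "b2": [0.0] * hidden,
        "W3": mat(1, hidden), "b3": [0.0],
    }


def deepfm_predict(active, p):
    """active: index of the one active feature in each feature group."""
    emb = [p["V"][i] for i in active]
    # FM part: weighted sum plus inner products of embedding pairs.
    first_order = sum(p["w"][i] for i in active)
    second_order = sum(dot(emb[a], emb[b])
                       for a in range(len(emb))
                       for b in range(a + 1, len(emb)))
    y_fm = first_order + second_order
    # Deep part: concat layer followed by two Fc (relu) layers.
    h = [v for e in emb for v in e]
    h = dense(h, p["W1"], p["b1"])
    h = dense(h, p["W2"], p["b2"])
    y_dnn = dense(h, p["W3"], p["b3"], relu=False)[0]
    # Combine both parts and squash to a probability.
    return 1.0 / (1.0 + math.exp(-(y_fm + y_dnn)))


params = init_params(n_features=10, n_fields=3)
p_click = deepfm_predict([0, 4, 7], params)
```

Both parts read the same `emb` list, which is the sharing of input and embedding vectors between FM and Deep that the text describes.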

By sharing representations between related tasks, multi-task learning (MTL) lets the DeepFM model generalize better on the original task. It is also an inductive transfer mechanism whose main goal is to improve generalization by using the domain-specific information contained in the training signals of multiple related tasks; it achieves this goal by training the tasks in parallel with a shared representation.

In practice, an application recommended to a user goes through several steps: exposure, click, and finally purchase. The goal of using the DeepFM multi-task learning model is to improve the final UV exposure conversion rate (CTCVR = number of purchasing users / number of exposed users). The exposure conversion rate (CTCVR) is computed as in formula (4): it is the product of the CTR (click-through rate estimate) and the CVR (purchase rate estimate).
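
Written out, the product of the two estimates takes the following form. The event notation is an assumption introduced here for illustration: $x$ stands for the input features, $y=1$ for a click, and $z=1$ for a purchase, with the purchase rate read as the rate of purchase after a click.

```latex
\underbrace{p(y=1,\, z=1 \mid x)}_{pCTCVR}
  \;=\;
\underbrace{p(y=1 \mid x)}_{pCTR}
  \times
\underbrace{p(z=1 \mid y=1,\, x)}_{pCVR}
```

For example, a predicted click-through rate of 0.2 and a predicted post-click purchase rate of 0.05 give a predicted exposure conversion rate of 0.2 × 0.05 = 0.01.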

A recommendation method that trains a click-through rate (CTR) model on post-exposure click samples, and ranks by pCTR at recommendation time, does not make full use of post-click conversion data: some applications have a high click-through rate but a very low post-click purchase rate, which leads to a low overall CTCVR. An alternative is to train an additional CVR model to predict the purchase rate after a click and then rank by pCTCVR, obtained as pCTR*pCVR. However, training a CVR model directly uses only the samples that the user clicked, and this data is often one to two orders of magnitude smaller than the exposure sample set, so the approach faces a sample sparsity problem.

Referring to Fig. 4, the embodiments of this specification use a DeepFM multi-task learning model. The model is mainly divided into a CTR part and a CVR part. It divides the input low-level features into multiple feature groups, each corresponding to one embedding vector, and the CTR part and the CVR part share the low-level features and the embedding vectors. As Fig. 4 shows, the CTR part and the CVR part are each a DeepFM model: each uses a feature concatenation layer (concat) to concatenate all embedding vectors, followed by two fully connected layers (Fc (relu)), to realize high-order feature combination, and each extracts feature combinations from the inner products of the embedding vectors to realize low-order feature combination. This yields a click-through rate prediction and a purchase rate prediction, where the purchase rate is the purchase rate after exposure. Finally, the click-through rate prediction is multiplied by the purchase rate prediction to obtain the exposure conversion rate prediction (i.e. pCTCVR). The loss function of the DeepFM multi-task learning model is given by formula (5), where yi indicates whether a click occurred and zi indicates whether a purchase occurred. The model optimizes the exposure click-through rate and the exposure purchase rate simultaneously, and pCVR is an intermediate node of the model. At recommendation time, the applications are sorted in descending order of pCTCVR and the top N are recommended to the user as exposed applications.
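
One way to realize a loss of this kind is sketched below. It assumes the loss is the sum of two cross-entropy terms, one on the click label against pCTR and one on the click-and-purchase label against pCTR × pCVR; that form is an assumption made for illustration and is not taken from formula (5). The function names are hypothetical.

```python
import math


def log_loss(label, p, eps=1e-12):
    """Binary cross-entropy for one example, with clipping for stability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def multitask_loss(batch):
    """batch: iterable of (y, z, p_ctr, p_cvr), with y = clicked and z = purchased."""
    total = 0.0
    for y, z, p_ctr, p_cvr in batch:
        # pCVR is never compared with a label directly; it is an intermediate node.
        p_ctcvr = p_ctr * p_cvr
        total += log_loss(y, p_ctr)        # exposure -> click task
        total += log_loss(y * z, p_ctcvr)  # exposure -> click-and-purchase task
    return total


good = multitask_loss([(1, 1, 0.9, 0.9)])
bad = multitask_loss([(1, 1, 0.1, 0.1)])
```

Because both terms are defined over every exposed sample, the CVR part receives gradient from the full exposure set rather than only from clicked samples, which is how this design sidesteps the sparsity problem noted above.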

In the training method for a recommendation model provided by one or more embodiments of this specification, user features of at least two sample users and attribute features of at least two sample applications are obtained; based on the user features and attribute features, positive samples, in which a sample user clicked and purchased an exposed sample application, and negative samples, in which a sample user clicked but did not purchase, or did not click, an exposed sample application, are generated; and the recommendation model is trained on the sample set including at least one positive sample and one negative sample to obtain the recommendation model, which outputs, for each sample user and each exposed sample application, the exposure conversion rate obtained from the click-through rate and the purchase rate. Training the deep-learning-based DeepFM multi-task learning model makes full use of the data on exposed applications that sample users purchased after clicking or did not click, of the automatic feature-crossing ability of the DeepFM multi-task learning model, and of the post-click conversion features of exposed applications. This solves the feature sparsity problem and greatly improves the recommendation of exposed applications.

The processor 120 may also perform the steps of the method shown in Fig. 5. Fig. 5 is a schematic flowchart of a recommendation method according to an embodiment of this specification, including steps 502 to 510.

Step 502: receive a recommendation request from a user to be recommended for exposed applications, wherein the user to be recommended carries a user identifier.

The user to be recommended is the user to whom exposed applications are to be recommended; an exposed application is an application that is displayed and recommended to the user to be recommended.

The user identifier is unique identification information that distinguishes each user to be recommended, for example a unique special character string or a special mark assigned to each user to be recommended.

In one or more embodiments of this specification, before the recommendation request of the user to be recommended for exposed applications is received, the method further includes:

obtaining multiple applications that carry identifiers;

The application to be recommended is an application waiting to be recommended to the user. The first preset condition includes, but is not limited to, selecting applications within a preset interval; for example, if the preset interval is 1 to 200, the first preset condition may be to select applications priced between 1 yuan and 200 yuan. The first preset condition may also include excluding applications related to pornography, gambling, or drugs, or applications of poor quality. Poor-quality applications can be identified from an application's score and number of comments, for example by setting lower limits on both and treating applications with a score below 3 points and fewer than 200 ratings as poor quality.

The first preset condition can be configured according to actual needs, and this application places no limit on it. After the multiple applications are screened by the first preset condition, their number decreases, which reduces the subsequent workload while ensuring the quality of the recommended applications.

Taking as an example a first preset condition that selects applications priced between 1 yuan and 200 yuan, screening the multiple applications based on the first preset condition to determine at least two applications to be recommended means selecting the applications priced between 1 yuan and 200 yuan as the applications to be recommended.
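
A first preset condition built from the examples above can be sketched as a simple predicate. The text presents the price interval, the content exclusion, and the quality floor as alternatives that can be configured as needed; combining all three in one predicate is an assumption of this sketch, as are the `App` fields and the function name.

```python
from dataclasses import dataclass


@dataclass
class App:
    """Candidate application (hypothetical fields)."""
    item_id: str
    price: float
    score: float
    n_ratings: int
    banned_topic: bool = False  # related to pornography, gambling, or drugs


def passes_first_condition(app, low=1.0, high=200.0, min_score=3.0, min_ratings=200):
    in_price_range = low <= app.price <= high
    # Poor quality as in the example: score below 3 and fewer than 200 ratings.
    low_quality = app.score < min_score and app.n_ratings < min_ratings
    return in_price_range and not app.banned_topic and not low_quality


apps = [
    App("a", 6.0, 4.5, 1000),
    App("b", 300.0, 4.8, 5000),                   # outside the price interval
    App("c", 12.0, 2.1, 50),                      # poor quality
    App("d", 30.0, 4.0, 800, banned_topic=True),  # excluded topic
    App("e", 1.0, 2.5, 900),                      # low score but enough ratings
]
kept = [a.item_id for a in apps if passes_first_condition(a)]
```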

In practice, all applications are stored in ODPS, which can solve massive-data computation problems more quickly, so that multiple applications to be recommended can be selected rapidly from the full set of applications, while effectively reducing enterprise cost and ensuring data security.

Step 504: determine, based on the user identifier, at least two applications to be recommended that match the user to be recommended.

The applications to be recommended can be regarded as the initial candidates to show the user; when the recommendation model later makes its prediction, the exposure conversion rate of each application to be recommended is obtained.

Specifically, after the multiple applications are screened based on the first preset condition and at least two applications to be recommended are determined, the method further includes:

The preset matching rule includes, but is not limited to: recommending the top-ranked popular applications to be recommended in each category (denoted hot); recommending applications to be recommended according to each user's preference for Taobao categories (denoted U2C2I); grouping users by gender/age/city/purchasing power/interest tags and recommending the applications to be recommended that users in the same group have clicked (denoted U2G2I); and/or recommending applications to be recommended that are similar to applications the user has clicked (denoted Item-CF).

Taking as an example a preset matching rule that recommends the top-ranked popular applications in each category, the user to be recommended is matched with the at least two applications to be recommended based on the preset matching rule. That is, based on the user identifier and the application identifiers, the ten top-ranked popular applications in the game category are recommended to the user to be recommended as applications to be recommended, so that the user to be recommended is matched with the recommended applications.

After matching by the above preset matching rules, each user to be recommended corresponds to dozens or hundreds of applications to be recommended. The applications matched to each user to be recommended are then recorded in a database, such as an HBase database, keyed by the user identifier of the user and the identifiers of the applications. After a recommendation request of a user to be recommended for exposed applications is received, the matched applications to be recommended can be looked up in the HBase database by the user identifier of that user.
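
The merge-and-store step can be sketched as follows. An in-memory dictionary stands in for the HBase table purely for illustration; the function names, the deduplication policy, and the `limit` of 300 candidates are hypothetical.

```python
def merge_candidates(*candidate_lists, limit=300):
    """Union the outputs of several matching rules, keeping first-seen order."""
    seen, merged = set(), []
    for candidates in candidate_lists:
        for item_id in candidates:
            if item_id not in seen:
                seen.add(item_id)
                merged.append(item_id)
    return merged[:limit]


# Stand-in for the database table: user_id -> matched item_ids.
match_table = {}


def store_matches(user_id, hot, u2c2i, u2g2i, item_cf):
    match_table[user_id] = merge_candidates(hot, u2c2i, u2g2i, item_cf)


def lookup_matches(user_id):
    """Return the candidates recorded for this user, or an empty list."""
    return match_table.get(user_id, [])


store_matches("u1", hot=["a", "b"], u2c2i=["b", "c"], u2g2i=[], item_cf=["a", "d"])
candidates = lookup_matches("u1")
```

Precomputing the candidates offline means the online request only needs a single keyed lookup before feature extraction and scoring.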

Step 506: extract the user features of the user to be recommended and the attribute features of the at least two applications to be recommended.

The user features and the attribute features include offline features and real-time features. The offline features include the collected historical features of the user to be recommended and of the applications to be recommended; the real-time features include the collected features of the user to be recommended and of the applications to be recommended at the current time.

For the extraction of the user features and the attribute features, refer to the embodiments above; details are not repeated here.

Step 508: input the user features and the attribute features into the pre-trained recommendation model to obtain the exposure conversion rate of the user to be recommended for each matched application to be recommended.

The recommendation model includes a DeepFM multi-task learning model, which includes a click-through rate estimation part and a purchase rate estimation part.

In practice, the offline part of the extracted user features and attribute features can be synchronized to an HBase database. When the recommendation model predicts the exposure conversion rate of the user to be recommended for each matched application to be recommended, the offline features of the user and of each matched application can be read directly from the HBase database in real time and combined with the features extracted in real time to make the prediction. While the model is predicting, the attribute features can also be recorded as a feature log that flows back to ODPS; the recommendation model is then retrained offline on this feature log and the deployed model is updated, so that the recommendation model is continuously optimized.

Step 510: based on the exposure conversion rate, recommend at least one of the at least two applications to be recommended to the user to be recommended as an exposed application.

In one or more embodiments of this specification, recommending at least one of the at least two exposed applications to the user to be recommended based on the exposure conversion rate includes:

In one or more embodiments of this specification, recommending at least one of the at least two applications to be recommended to the user to be recommended as an exposed application based on the exposure conversion rate includes:

The sorting includes, but is not limited to, descending order, and the preset recommendation condition includes, but is not limited to, selecting the 30 top-ranked applications to be recommended.

In practice, the at least two applications to be recommended are sorted based on the exposure conversion rate;

that is, the at least two applications to be recommended may be sorted in descending order of exposure conversion rate, and the 30 top-ranked applications to be recommended are then recommended to the user to be recommended as exposed applications. These 30 applications are the ones actually exposed and recommended to the user.

In another implementation, after the at least two applications to be recommended are sorted based on the exposure conversion rate, the method further includes:

screening the at least two applications to be recommended based on a second preset condition.

The second preset condition may include, but is not limited to, identifying and removing applications that appear in a preset blacklist.

In the case where the at least two applications to be recommended are screened based on the second preset condition, selecting at least one of them based on the preset recommendation condition and recommending it to the user to be recommended as an exposed application includes:

The at least two applications to be recommended are sorted in descending order of exposure conversion rate and then matched against the applications in the preset blacklist. Any application to be recommended that matches is removed from the descending queue, after which the 30 or 20 applications with the highest exposure conversion rate are recommended to the user to be recommended as exposed applications. In this way the best applications to be recommended are recommended to the user as exposed applications, which improves user experience.
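
The sort, blacklist screening, and top-N selection described above can be sketched in a few lines. The function name `recommend` and the example scores are hypothetical.

```python
def recommend(scores, blacklist=frozenset(), top_n=30):
    """scores maps item_id to the predicted exposure conversion rate."""
    # Descending sort by predicted exposure conversion rate.
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Remove blacklisted applications from the sorted queue.
    kept = [item_id for item_id in ranked if item_id not in blacklist]
    # Keep the top N as the applications actually exposed.
    return kept[:top_n]


scores = {"a": 0.03, "b": 0.10, "c": 0.07, "d": 0.01}
exposed = recommend(scores, blacklist={"c"}, top_n=2)
```

Filtering before truncating matters: truncating first would leave fewer than N applications whenever a blacklisted application ranks inside the top N.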

In actual use, when a recommendation request of a user to be recommended for exposed applications is received, the applications to be recommended that have been screened and matched to the user are obtained first; the attribute features of these applications and the user features of the matched user are then input into the pre-trained recommendation model to obtain the exposure conversion rate of the user to be recommended for each application to be recommended.

Finally, the applications to be recommended for the user are sorted by exposure conversion rate, and the top 50 or top 60, as required, are selected as the final exposed applications and shown to the user to be recommended.

In the recommendation method provided by one or more embodiments of this specification, the obtained applications are first screened, i.e. filtered, to select high-quality applications to be recommended. Strategies such as hot/U2C2I/U2G2I/Item-CF are then used to match users with applications to be recommended, which avoids spending a large amount of manpower on building a label system, and the multiple matching strategies cover more of the applications that a user may click. A deep learning recommendation model then predicts the exposure conversion rate online in real time from the user to be recommended and the matched applications, and suitable applications to be recommended are recommended to the user as the final exposed applications according to the exposure conversion rate. Real-time features are used effectively and the recommendation effect is improved.

In practice, the recommendation model trained in this specification also needs to be deployed to an online server to score applications online in real time. The arks platform is generally used; it provides high-performance online ranking and real-time estimation services with high availability, and implements functions such as load balancing and remote disaster recovery. When a user requests application recommendations, the retrieval module first retrieves from HBase, by the user identifier user_id, the hundreds of candidate applications, i.e. applications to be recommended, that were matched to the user in the matching stage. The offline and real-time features of the user and of the applications to be recommended are passed to the recommendation model for real-time scoring, which yields the user's exposure conversion rate for each application to be recommended. Finally, the applications to be recommended are sorted in descending order of exposure conversion rate, and the 30 top-ranked applications are shown to the user as the final exposed applications. In addition, to avoid recommending unsuitable exposed applications to users, a blacklist filtering mechanism can be set up to promptly filter out exposed applications that are bad cases.

Referring to Fig. 6, one or more embodiments of this specification provide a training device for a recommendation model, including:

a first obtaining module 602, configured to obtain user features of at least two sample users and attribute features of at least two sample applications;

a generation module 604, configured to generate, based on the user features and attribute features, positive samples in which a sample user clicked and purchased an exposed sample application and negative samples in which a sample user clicked but did not purchase, or did not click, an exposed sample application;

a training module 606, configured to train the recommendation model on the sample set including at least one positive sample and one negative sample to obtain the recommendation model, which outputs, for each sample user and each exposed sample application, the exposure conversion rate obtained from the click-through rate and the purchase rate.

Optionally, the device further includes:

Optionally, the training module is further configured to:

Optionally, the device further includes:

Optionally, the training module 606 includes:

With the training device for a recommendation model provided by one or more embodiments of this specification, training the multi-task learning model based on the deep learning model DeepFM makes full use of the data on exposed applications that sample users purchased after clicking or did not click, of the automatic feature-crossing ability of the DeepFM multi-task learning model, and of the post-click conversion features of exposed applications. This solves the feature sparsity problem and greatly improves the recommendation of exposed applications.

The above is an exemplary scheme of the training device for a recommendation model of this embodiment. It should be noted that the technical solution of the training device and the technical solution of the training method for a recommendation model described above belong to the same concept; for details not described in the technical solution of the training device, refer to the description of the technical solution of the training method.

Referring to Fig. 7, one or more embodiments of this specification further provide a recommendation device, including:

a receiving module 702, configured to receive a recommendation request from a user to be recommended for exposed applications, wherein the user to be recommended carries a user identifier;

a determining module 704, configured to determine, based on the user identifier, at least two applications to be recommended that match the user to be recommended;

an extraction module 706, configured to extract the user features of the user to be recommended and the attribute features of the at least two applications to be recommended;

an obtaining module 708, configured to input the user features and the attribute features into the pre-trained recommendation model to obtain the exposure conversion rate of the user to be recommended for each matched application to be recommended;

a first recommendation module 710, configured to recommend, based on the exposure conversion rate, at least one of the at least two applications to be recommended to the user to be recommended as an exposed application.

Optionally, the device further includes:

Optionally, the first recommendation module 710 includes:

Optionally, the device further includes:

a second recommendation submodule, further configured to:

With the recommendation device provided by one or more embodiments of this specification, the obtained applications are first screened, i.e. filtered, to select high-quality applications to be recommended. Strategies such as hot/U2C2I/U2G2I/Item-CF are then used to match users with applications to be recommended, which avoids spending a large amount of manpower on building a label system, and the multiple matching strategies cover more of the applications that a user may click. A deep learning recommendation model then predicts the exposure conversion rate online in real time from the user to be recommended and the matched applications, and suitable applications to be recommended are recommended to the user as the final exposed applications according to the exposure conversion rate. Real-time features are used effectively and the recommendation effect is improved.

The above is an exemplary scheme of the recommendation device of this embodiment. It should be noted that the technical solution of the recommendation device and the technical solution of the recommendation method described above belong to the same concept; for details not described in the technical solution of the recommendation device, refer to the description of the technical solution of the recommendation method.

The matching rules, user features, and DeepFM multi-task learning model provided in this specification were compared in online experiments on the "Darwin" laboratory A/B test platform, with the results below. The main comparison metrics are UV click-through rate (number of clicking users / number of exposed users) and UV exposure conversion rate (number of purchasing users / number of exposed users).

1. Matching rule experiment

Matching with the Hot and U2C2I rules together, compared with matching with Hot alone, raises the UV click-through rate by 9.93% (15.11% --> 16.61%).

Matching with all four rules (Hot, U2C2I, U2G2I, and Item-CF), compared with matching with Hot and U2C2I together, raises the UV click-through rate by 10.86% (28.18% --> 31.25%).

The experiments show that matching with multiple matching rules clearly improves the UV click-through rate.
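
The first figure above, for instance, is consistent with a relative improvement computed from the two rates in parentheses. The helper below is a hypothetical illustration of that arithmetic only.

```python
def relative_lift(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


lift = relative_lift(15.11, 16.61)  # Hot alone versus Hot plus U2C2I
```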

2. Feature experiment

Adding the interest preference feature, compared with not using it, raises the exposure conversion rate by 1.93% (15.78% --> 16.08%). This indicates that users' click and purchase behavior is closely related to their own interest preferences.

3. Model experiment

The first experiment compares a CTR model with the product of the scores of two separate CTR and CVR models. Both the CTR model and the CVR model use DeepFM, with identical model structure and input features. The results are shown in Table 1. As Table 1 shows, multiplying the scores of two separate CTR and CVR models lowers the UV click-through rate and raises the UV post-click purchase rate, but the final UV exposure conversion rate is still worse than that of the CTR model alone.

Table 1

The CTR model was then compared online with the DeepFM multi-task learning model; the two models were trained with identical input features, DeepFM structure, and sample volume. The results are shown in Table 2, with the DeepFM-based CTR model as the baseline. Compared with the CTR model, the DeepFM multi-task learning model is 3.73% lower on UV click-through rate, 5.01% higher on UV post-click purchase rate, and 1.09% higher on overall UV exposure conversion rate. The experiment shows that the DeepFM multi-task learning model improves the UV exposure conversion rate.

Table 2

An embodiment of this application further provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the training method for a recommendation model or of the recommendation method described above.

The above is an exemplary scheme of the computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the training method for a recommendation model or of the recommendation method described above belong to the same concept; for details not described in the technical solution of the storage medium, refer to the description of the technical solution of the training method or of the recommendation method.

It is above-mentioned that this specification specific embodiment is described.Other embodiments are in the scope of the appended claims It is interior.In some cases, the movement recorded in detail in the claims or step can be come according to the sequence being different from embodiment It executes and desired result still may be implemented.In addition, process depicted in the drawing not necessarily require show it is specific suitable Sequence or consecutive order are just able to achieve desired result.In some embodiments, multitasking and parallel processing be also can With or may be advantageous.

The computer instructions include computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.

It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions, but those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application certain steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily all required by the present application.

In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.

The preferred embodiments of the present application disclosed above are only intended to help illustrate the present application. The optional embodiments do not describe all details exhaustively, nor do they limit the invention to the specific embodiments described. Obviously, many modifications and variations can be made in light of the content of this specification. These embodiments were chosen and specifically described in order to better explain the principles and practical applications of the present application, so that those skilled in the art can better understand and use the present application. The present application is limited only by the claims and their full scope and equivalents.

Claims

1. A training method for a recommendation model, characterized by comprising:

generating, based on user features and attribute features, positive samples in which a sample user clicked and purchased an exposed sample application, and negative samples in which a sample user clicked but did not purchase, or did not click, an exposed sample application;

training a recommendation model based on a sample set including at least one positive sample and negative sample to obtain the recommendation model, wherein the recommendation model outputs, for each sample user, an exposure conversion rate obtained based on the click-through rate and the purchase rate for each exposed sample application.
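
A minimal sketch of the sample generation recited in claim 1, assuming each exposure is logged with a click flag and a purchase flag. The `Exposure` schema, the feature dictionaries, and the function names are hypothetical and are used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exposure:
    """One exposure of a sample application to a sample user (hypothetical schema)."""
    user_id: str
    app_id: str
    clicked: bool
    purchased: bool


def label_exposure(event: Exposure) -> int:
    """Return 1 for clicked-and-purchased, 0 for clicked-not-purchased or not clicked."""
    return 1 if (event.clicked and event.purchased) else 0


def build_sample_set(exposures, user_features, app_features):
    """Pair the user features and attribute features of each exposure with its label."""
    samples = []
    for event in exposures:
        features = {
            "user": user_features[event.user_id],
            "app": app_features[event.app_id],
        }
        samples.append((features, label_exposure(event)))
    return samples


exposures = [
    Exposure("u1", "a1", clicked=True, purchased=True),
    Exposure("u1", "a2", clicked=True, purchased=False),
    Exposure("u2", "a1", clicked=False, purchased=False),
]
user_features = {"u1": {"age_bucket": 2}, "u2": {"age_bucket": 4}}
app_features = {"a1": {"category": "games"}, "a2": {"category": "tools"}}

sample_set = build_sample_set(exposures, user_features, app_features)
print([label for _, label in sample_set])  # [1, 0, 0]
```

Both kinds of negative sample (clicked without purchase, and not clicked) receive the same label here, matching the wording of the claim.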

2. The method according to claim 1, characterized in that, before training the recommendation model based on the sample set including at least one positive sample and negative sample, the method further comprises:

dividing, based on a preset screening rule, the sample set into a training sample set including at least one positive sample and negative sample and a test sample set including at least one positive sample and negative sample.
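
The claim does not specify the preset screening rule. As one possible sketch, the rule below is assumed to be a seeded per-label random split, which guarantees that both resulting sets contain at least one positive and one negative sample. The function name `split_samples` and the parameters `test_fraction` and `seed` are hypothetical.

```python
import random


def split_samples(samples, test_fraction=0.2, seed=0):
    """Split (features, label) pairs per label so each set keeps both labels."""
    rng = random.Random(seed)
    train, test = [], []
    for label in (0, 1):
        group = [s for s in samples if s[1] == label]
        if len(group) < 2:
            raise ValueError(f"need at least two samples with label {label}")
        rng.shuffle(group)
        # At least one sample of this label goes to each side of the split.
        n_test = min(len(group) - 1, max(1, round(len(group) * test_fraction)))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


samples = [({"id": i}, i % 2) for i in range(10)]
train_set, test_set = split_samples(samples)
print(len(train_set), len(test_set))  # 8 2
```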

3. The method according to claim 2, characterized in that training the recommendation model based on the sample set including at least one positive sample and negative sample comprises:

4. The method according to claim 3, characterized in that, after training the recommendation model based on the sample set including at least one positive sample and negative sample, the method further comprises:

5. The method according to any one of claims 1-4, characterized in that the recommendation model includes a DeepFM multi-task learning model, and the DeepFM multi-task learning model includes a click-through rate estimation part and a conversion rate estimation part.

6. The method according to claim 5, characterized in that the recommendation model outputting, for each sample user, the exposure conversion rate obtained based on the click-through rate and the purchase rate for each exposed sample application comprises:

the click-through rate estimation part of the recommendation model outputting the click-through rate of each sample user for each exposed sample application;

the conversion rate estimation part of the recommendation model outputting the purchase rate of each sample user for each exposed sample application;

determining, based on the click-through rate and the purchase rate, the exposure conversion rate of each sample user for each exposed sample application.
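
The claim states only that the exposure conversion rate is determined based on the click-through rate and the purchase rate. The sketch below assumes the common formulation in which it is their product, and assumes that each estimation part emits a logit passed through a sigmoid; the function names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def exposure_conversion(click_logit: float, purchase_logit: float):
    """Return (click-through rate, purchase rate, exposure conversion rate)."""
    click_rate = sigmoid(click_logit)        # click-through rate estimation part
    purchase_rate = sigmoid(purchase_logit)  # conversion rate estimation part
    # Assumed combination: probability of a click times probability of a purchase after the click.
    return click_rate, purchase_rate, click_rate * purchase_rate


print(exposure_conversion(0.0, 0.0))  # (0.5, 0.5, 0.25)
```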

7. The method according to claim 1, characterized in that the user features and the attribute features include offline features and real-time features, wherein the offline features include acquired historical features of the sample user and the sample application, and the real-time features include acquired features of the sample user and the sample application at the time an event occurs.

8. A recommendation method, characterized by comprising:

receiving a recommendation request from a user to be recommended for exposed applications, wherein the user to be recommended carries a user identifier;

inputting the user features and the attribute features into a pre-trained recommendation model to obtain the exposure conversion rate of the user to be recommended for each matched application to be recommended;

recommending, based on the exposure conversion rate, at least one of the at least two applications to be recommended to the user to be recommended as an exposed application.

9. The method according to claim 8, characterized in that, before receiving the recommendation request from the user to be recommended for exposed applications, the method further comprises:

obtaining a plurality of applications carrying labels;

screening the plurality of applications based on a first preset condition to determine at least two applications to be recommended.

10. The method according to claim 9, characterized in that, after determining the at least two applications to be recommended, the method further comprises:

matching the user to be recommended with the at least two applications to be recommended based on a preset matching rule, wherein the user to be recommended carries a user identifier.

11. The method according to any one of claims 8-10, characterized in that the recommendation model includes a DeepFM multi-task learning model, and the DeepFM multi-task learning model includes a click-through rate estimation part and a purchase rate estimation part.

12. The method according to claim 8, characterized in that recommending, based on the exposure conversion rate, at least one of the at least two applications to be recommended to the user to be recommended as an exposed application comprises:

selecting, based on a preset recommendation condition, at least one of the at least two applications to be recommended and recommending it to the user to be recommended as an exposed application.

13. The method according to claim 12, characterized in that, after sorting the at least two applications to be recommended based on the exposure conversion rate, the method further comprises:

selecting, based on the preset recommendation condition, at least one of the at least two applications to be recommended and recommending it to the user to be recommended as an exposed application

selecting, based on the preset recommendation condition, at least one of the screened at least two applications to be recommended and recommending it to the user to be recommended as an exposed application.
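
The sorting, screening, and selection referred to in claims 12 and 13 can be sketched as follows. The preset recommendation condition is assumed here to be a top-k cutoff, and the screening condition is assumed to be an arbitrary predicate on the application identifier; neither is specified in the claims, and the names `recommend`, `top_k`, and `keep` are hypothetical.

```python
def recommend(scores, top_k=3, keep=lambda app_id: True):
    """Rank applications by exposure conversion rate, screen them, and keep the top k."""
    # Sort by predicted exposure conversion rate, highest first.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Apply the (assumed) screening predicate, preserving rank order.
    screened = [app_id for app_id, _ in ranked if keep(app_id)]
    # Apply the (assumed) recommendation condition: take the first top_k entries.
    return screened[:top_k]


scores = {"app_a": 0.10, "app_b": 0.30, "app_c": 0.20}
print(recommend(scores, top_k=2))                                # ['app_b', 'app_c']
print(recommend(scores, top_k=2, keep=lambda a: a != "app_b"))   # ['app_c', 'app_a']
```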

14. The method according to claim 8, characterized in that the user features and the attribute features include offline features and real-time features, wherein the offline features include acquired historical features of the user to be recommended and the application to be recommended, and the real-time features include acquired features of the user to be recommended and the application to be recommended at the current moment.

15. A training device for a recommendation model, characterized by comprising:

a first acquisition module configured to obtain user features of at least two sample users and attribute features of at least two sample applications;

a generation module configured to generate, based on the user features and the attribute features, positive samples in which a sample user clicked and purchased an exposed sample application, and negative samples in which a sample user clicked but did not purchase, or did not click, an exposed sample application;

a training module configured to train a recommendation model based on a sample set including at least one positive sample and negative sample to obtain the recommendation model, wherein the recommendation model outputs, for each sample user, an exposure conversion rate obtained based on the click-through rate and the purchase rate for each exposed sample application.

16. The device according to claim 15, characterized in that the device further comprises:

a first screening module configured to divide, based on a preset screening rule, the sample set into a training sample set including at least one positive sample and negative sample and a test sample set including at least one positive sample and negative sample.

17. The device according to claim 16, characterized in that the training module is further configured to:

18. The device according to claim 17, characterized in that the device further comprises:

a test module configured to test the recommendation model based on the test sample set including at least one positive sample and negative sample.

19. The device according to any one of claims 15-18, characterized in that the recommendation model includes a DeepFM multi-task learning model, and the DeepFM multi-task learning model includes a click-through rate estimation part and a purchase rate estimation part.

20. The device according to claim 19, characterized in that the training module comprises:

a first output sub-module configured such that the click-through rate estimation part of the recommendation model outputs the click-through rate of each sample user for each exposed sample application;

a second output sub-module configured such that the purchase rate estimation part of the recommendation model outputs the purchase rate of each sample user for each exposed sample application;

a determination sub-module configured to determine, based on the click-through rate and the purchase rate, the exposure conversion rate of each sample user for each exposed sample application.

21. The device according to claim 15, characterized in that the user features and the attribute features include offline features and real-time features, wherein the offline features include acquired historical features of the sample user and the sample application, and the real-time features include acquired features of the sample user and the sample application at the time an event occurs.

22. A recommendation device, characterized by comprising:

a receiving module configured to receive a recommendation request from a user to be recommended for exposed applications, wherein the user to be recommended carries a user identifier;

a determination module configured to determine, based on the user identifier, at least two applications to be recommended that match the user to be recommended;

an extraction module configured to extract user features of the user to be recommended and attribute features of the at least two applications to be recommended;

an obtaining module configured to input the user features and the attribute features into a pre-trained recommendation model to obtain the exposure conversion rate of the user to be recommended for each matched application to be recommended;

a first recommendation module configured to recommend, based on the exposure conversion rate, at least one of the at least two applications to be recommended to the user to be recommended as an exposed application.

23. The device according to claim 22, characterized in that the device further comprises:

24. The device according to claim 22, characterized in that the device further comprises:

a matching module configured to match the user to be recommended with the at least two applications to be recommended based on a preset matching rule, wherein the user to be recommended carries a user identifier.

25. The device according to any one of claims 22-24, characterized in that the recommendation model includes a DeepFM multi-task learning model, and the DeepFM multi-task learning model includes a click-through rate estimation part and a purchase rate estimation part.

26. The device according to claim 22, characterized in that the first recommendation module comprises:

a sorting sub-module configured to sort the at least two applications to be recommended based on the exposure conversion rate;

a second recommendation sub-module configured to select, based on a preset recommendation condition, at least one of the at least two applications to be recommended and recommend it to the user to be recommended as an exposed application.

27. The device according to claim 26, characterized in that the device further comprises:

a third screening module configured to screen the at least two applications to be recommended based on a second preset condition;

the second recommendation sub-module is further configured to:

28. The device according to claim 22, characterized in that the user features and the attribute features include offline features and real-time features, wherein the offline features include acquired historical features of the user to be recommended and the application to be recommended, and the real-time features include acquired features of the user to be recommended and the application to be recommended at the current moment.

29. A computing device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, characterized in that the processor, when executing the instructions, implements the steps of the method according to any one of claims 1-14.

30. A computer-readable storage medium storing computer instructions, characterized in that the instructions, when executed by a processor, implement the steps of the method according to any one of claims 1-14.