A method for mining user topics and recommending applications
Technical Field
The present invention relates to the field of data mining, and more particularly to a method for mining user topics and recommending applications.
Background
On mobile platforms, user topics are usually mined from explicit user labels such as age, activity level, and geographical location. In fact, the list of applications a user has installed also reflects some of the user's topic preferences. However, the number of applications installed varies greatly from user to user, and users in different countries or regions differ considerably in their preference for the same application, so this information is difficult to use properly. The topics mined by a traditional topic mining algorithm, such as the LDA algorithm, cannot reflect both the user's topic distribution and the influence of geographical location.
The topic distribution contained in users' installed-application lists and geographical location information can be used not only for user analysis but also for recommending applications to users. A traditional recommendation algorithm trains a model on user features, application features, and joint user-application features in order to recommend applications to users. However, a user's installed-application list cannot be used directly as a feature, because the number of installed applications differs greatly between users. If the latent features of users and applications can be mined from the installed-application lists of many users, a similarity between a user and an application can be computed from them and used as an important feature for judging whether the user likes the application.
Summary of the invention
The present invention provides a method for mining user topics and recommending applications. The method mines the latent features of users and applications from the users' installed-application lists and recommends to each user the applications the user is likely to want.
To achieve the above technical effect, the technical solution of the present invention is as follows:
A method for mining user topics and recommending applications, comprising the following steps:
S1: calculating the weight of each application's contribution to the user's topics;
S2: establishing a probabilistic graphical model of the user's selection process when installing applications;
S3: solving for the parameters of the probabilistic graphical model, and completing topic mining and application recommendation.
Further, in step S1 the weight of an application's contribution to the user's topics is calculated as follows:
where user denotes a user, app denotes an application installed by that user, $L_{user}$ denotes the number of applications installed by user, $L_{average}$ denotes the average number of applications installed per user, $|U|$ denotes the total number of users in the data set, and $n_{app}$ denotes the total number of users who have installed app. After rounding up, weight is an integer greater than 0.
Further, the process of step S2 is as follows:
S21: generating the user's preference variable $x_{u,n}$ from the user's selection-preference distribution, that is, $x_{u,n}$ follows a binomial distribution whose parameter is that preference distribution;
S22: if the value of $x_{u,n}$ obtained in S21 is 0, the user considers personal preference when selecting the application; first, the topic $z_{u,n}$ of the application to be installed is sampled from the user's distribution over topics, that is, the topic follows a multinomial distribution whose parameter is the user's topic distribution;
S23: generating the corresponding application from the distribution of topic $z$ over applications, that is, the application to be installed is sampled from a multinomial distribution whose parameter is that topic's distribution over applications;
S24: if the value of $x_{u,n}$ obtained in S21 is 1, the user considers the geographical location when selecting the application; with user $u$ located at geographical location $l$, the application to be installed is sampled from a multinomial distribution whose parameter is that location's preference distribution over applications;
where $x_{u,n}$ denotes the selection preference of the $u$-th user for the $n$-th installed application, $x_{u,n} \in \{0, 1\}$; when $x_{u,n}$ is 0, user $u$ chose to install the application according to personal preference, and when $x_{u,n}$ is 1, user $u$ chose to install it according to the geographical location of user $u$; $z_{u,n}$ denotes the topic of the $n$-th application installed by the $u$-th user; $app_u$ denotes an application installed by user $u$. The model uses four distributions: the selection-preference distribution of user $u$, the preference distribution of user $u$ over topics, the distribution of topic $z$ over applications, and the preference distribution of country $l$ over applications, each with its own prior distribution parameter. To make the model parameters easier to solve, each prior is taken to be the corresponding conjugate distribution: the conjugate of the binomial distribution is the Beta distribution, and the conjugate of the multinomial distribution is the Dirichlet distribution;
Here, when the weight of application app installed by user is weight, the user is regarded as having selected and installed app weight times in total.
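The selection process of steps S21 to S24 can be illustrated with a short simulation. This is a minimal sketch, not the patented implementation: the function name `sample_install` and the parameter names `lam_u`, `theta_u`, `phi`, and `psi_l` are hypothetical labels for the four distributions described above, and the distributions are assumed to be given as plain lists of probabilities.

```python
import random


def sample_install(rng, lam_u, theta_u, phi, psi_l):
    """Sample one install event for a user, following steps S21 to S24.

    lam_u   : probability that the user chooses by location (x = 1); hypothetical name
    theta_u : list, the user's distribution over topics; hypothetical name
    phi     : list of lists, phi[z] is topic z's distribution over apps; hypothetical name
    psi_l   : list, the location's distribution over apps; hypothetical name
    Returns (x, z, app_index); z is None when x == 1.
    """
    apps = range(len(psi_l))
    # S21: draw the preference variable x (a binomial with a single trial)
    x = 1 if rng.random() < lam_u else 0
    if x == 0:
        # S22: personal preference, so draw a topic from the user's topic distribution
        z = rng.choices(range(len(theta_u)), weights=theta_u, k=1)[0]
        # S23: draw the app from that topic's distribution over apps
        app = rng.choices(apps, weights=phi[z], k=1)[0]
        return x, z, app
    # S24: location-driven choice, so draw the app from the location's distribution
    app = rng.choices(apps, weights=psi_l, k=1)[0]
    return x, None, app
```

Calling `sample_install(random.Random(0), 0.3, [0.6, 0.4], [[0.5, 0.5, 0.0], [0.0, 0.2, 0.8]], [0.1, 0.1, 0.8])` draws one event for a user who chooses by location about 30% of the time.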
Further, the process of solving for the model parameters in step S3 is as follows:
Iteration is performed according to the following sampling formulas:
where the counts are defined as follows: after excluding the current selection of the current application, one count is the number of selected installations for which $x_{u,n}$ is 0, and another is the number for which $x_{u,n}$ is $x'$; correspondingly, a further count is the number of selected installations whose topic is $z$ after excluding the current selection of the current application; and a further count is, among the remaining $weight - 1$ selected installations of app after excluding the current selection, the number for which $x_{u,n}$ is 0; the remaining counts are defined analogously;
After $\Lambda$ iterations are completed, the model parameters are obtained from the following expressions:
The distribution of each topic over applications is the mined topic. The following formula is used to calculate the user's preference score for a given application:
$$p(app \mid u, l) = p(x=0 \mid u) \sum_{z} p(z \mid u)\, p(app \mid z) + p(x=1 \mid u)\, p(app \mid l)$$
The several applications with the highest preference scores are then recommended to the user.
Preferably, the number of iterations $\Lambda$ is not less than 300.
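The preference score $p(app \mid u, l)$ and the final ranking step can be sketched as follows. This is an illustrative sketch under stated assumptions: the names `preference_scores`, `recommend`, `p_x1`, `theta_u`, `phi`, and `psi_l` are hypothetical, the learned distributions are assumed to be available as plain lists, and skipping applications the user has already installed is an added assumption rather than something the text specifies.

```python
def preference_scores(p_x1, theta_u, phi, psi_l):
    """Score every app: p(x=0|u) * sum_z p(z|u) p(app|z) + p(x=1|u) * p(app|l).

    p_x1    : p(x=1|u), probability the user chooses by location
    theta_u : list, p(z|u) for each topic z
    phi     : list of lists, phi[z][a] = p(app=a|z)
    psi_l   : list, p(app=a|l) for the user's location l
    """
    p_x0 = 1.0 - p_x1
    scores = []
    for a in range(len(psi_l)):
        # Topic-driven part: mix the per-topic app probabilities by the user's topic weights
        topic_term = sum(theta_u[z] * phi[z][a] for z in range(len(theta_u)))
        scores.append(p_x0 * topic_term + p_x1 * psi_l[a])
    return scores


def recommend(scores, installed, k):
    """Return the indices of the k highest-scoring apps not in `installed` (assumed filter)."""
    candidates = [a for a in range(len(scores)) if a not in installed]
    return sorted(candidates, key=lambda a: scores[a], reverse=True)[:k]
```

Because both parts of the score are probabilities over the same application set, the scores for one user sum to 1, which makes them directly comparable across applications.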
Compared with the prior art, the beneficial effects of the technical solution of the present invention are as follows:
The present invention first quantitatively calculates the weight with which each installed application contributes to the user's topics, then uses these weights to build a model of the process by which a user selects applications to install, and then solves for the parameters of the model. The resulting model simulates how a user selects applications and recommends the applications the user is likely to want, thereby mining the latent features of users and applications from the users' installed-application lists and recommending applications of greater interest to the user.
Brief Description of the Drawings
Fig. 1 is a flowchart of solving for the model parameters in the present invention;
Fig. 2 is an illustration of the probabilistic graphical model.
Detailed Description
The drawings are for illustrative purposes only and shall not be construed as limiting this patent;
To better illustrate this embodiment, some components in the drawings may be omitted, enlarged, or reduced, and do not represent the dimensions of the actual product;
Those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.
The following further describes the technical solution of the present invention with reference to the accompanying drawings and examples.
Embodiment 1
A method for mining user topics and recommending applications, comprising the following steps:
S1: calculating the weight of each application's contribution to the user's topics;
S2: establishing a probabilistic graphical model of the user's selection process when installing applications;
S3: solving for the parameters of the probabilistic graphical model, and completing topic mining and application recommendation.
Further, in step S1 the weight of an application's contribution to the user's topics is calculated as follows:
where user denotes a user, app denotes an application installed by that user, $L_{user}$ denotes the number of applications installed by user, $L_{average}$ denotes the average number of applications installed per user, $|U|$ denotes the total number of users in the data set, and $n_{app}$ denotes the total number of users who have installed app. After rounding up, weight is an integer greater than 0.
As shown in Fig. 2, the process of step S2 is as follows:
S21: generating the user's preference variable $x_{u,n}$ from the user's selection-preference distribution, that is, $x_{u,n}$ follows a binomial distribution whose parameter is that preference distribution;
S22: if the value of $x_{u,n}$ obtained in S21 is 0, the user considers personal preference when selecting the application; first, the topic $z_{u,n}$ of the application to be installed is sampled from the user's distribution over topics, that is, the topic follows a multinomial distribution whose parameter is the user's topic distribution;
S23: generating the corresponding application from the distribution of topic $z$ over applications, that is, the application to be installed is sampled from a multinomial distribution whose parameter is that topic's distribution over applications;
S24: if the value of $x_{u,n}$ obtained in S21 is 1, the user considers the geographical location when selecting the application; with user $u$ located at geographical location $l$, the application to be installed is sampled from a multinomial distribution whose parameter is that location's preference distribution over applications;
where $x_{u,n}$ denotes the selection preference of the $u$-th user for the $n$-th installed application, $x_{u,n} \in \{0, 1\}$; when $x_{u,n}$ is 0, user $u$ chose to install the application according to personal preference, and when $x_{u,n}$ is 1, user $u$ chose to install it according to the geographical location of user $u$; $z_{u,n}$ denotes the topic of the $n$-th application installed by the $u$-th user; $app_u$ denotes an application installed by user $u$. The model uses four distributions: the selection-preference distribution of user $u$, the preference distribution of user $u$ over topics, the distribution of topic $z$ over applications, and the preference distribution of country $l$ over applications, each with its own prior distribution parameter. To make the parameters easier to solve, each prior is taken to be the corresponding conjugate distribution: the conjugate of the binomial distribution is the Beta distribution, and the conjugate of the multinomial distribution is the Dirichlet distribution. Two of the prior parameters have little influence on the resulting model parameters; for ease of computation, every dimension of each is set to the same value, and that value is less than 1. In Fig. 2, $L$ denotes the number of geographical locations, $K$ the number of topics, $U$ the number of users, and $N$ the number of applications installed by a user.
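The generative process of steps S21 to S24 can be summarized compactly in distributional notation. The symbols below are assumptions introduced only for this summary: $\lambda_u$ for the selection-preference parameter of user $u$, $\theta_u$ for the topic distribution of user $u$, $\phi_z$ for the distribution of topic $z$ over applications, $\psi_l$ for the distribution of location $l$ over applications, and $\gamma_0, \gamma_1, \alpha, \beta, \eta$ for the corresponding prior parameters. The binomial draw of $x_{u,n}$ is written as a Bernoulli draw, since it is a single trial.

```latex
\begin{aligned}
\lambda_u &\sim \mathrm{Beta}(\gamma_0, \gamma_1), & \theta_u &\sim \mathrm{Dirichlet}(\alpha), \\
\phi_z &\sim \mathrm{Dirichlet}(\beta), & \psi_l &\sim \mathrm{Dirichlet}(\eta), \\
x_{u,n} &\sim \mathrm{Bernoulli}(\lambda_u), \\
z_{u,n} \mid x_{u,n} = 0 &\sim \mathrm{Multinomial}(\theta_u), \\
app_{u,n} \mid x_{u,n} = 0,\ z_{u,n} &\sim \mathrm{Multinomial}(\phi_{z_{u,n}}), \\
app_{u,n} \mid x_{u,n} = 1 &\sim \mathrm{Multinomial}(\psi_l).
\end{aligned}
```

Here $u = 1, \dots, U$, $z = 1, \dots, K$, $l = 1, \dots, L$, and $n$ ranges over the $N$ selected installations of user $u$, matching the plate sizes named for Fig. 2.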
Here, according to claim 2, when the weight of application app installed by user is weight, the user is regarded as having selected and installed app weight times in total.
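The weighting convention above amounts to repeating each installed application weight times in the user's list of selection events. A minimal sketch, assuming the integer weights have already been computed and are stored in a dictionary; the function name `expand_installs` and the sample application names are hypothetical:

```python
def expand_installs(weighted_installs):
    """Turn {app: weight} into a flat list with each app repeated `weight` times."""
    events = []
    for app, weight in weighted_installs.items():
        if weight < 1:
            # The weight is defined as an integer greater than 0
            raise ValueError("weight must be a positive integer")
        events.extend([app] * weight)
    return events


# Hypothetical example: an app with weight 2 yields two selection events
events = expand_installs({"maps": 2, "chat": 1})
```

Each element of the resulting list is then treated as one selected installation with its own preference variable and, when applicable, its own topic.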
The process of solving for the model parameters in step S3 is as follows:
Iteration is performed according to the following sampling formulas:
where the counts are defined as follows: after excluding the current selection of the current application, one count is the number of selected installations for which $x_{u,n}$ is 0, and another is the number for which $x_{u,n}$ is $x'$; correspondingly, a further count is the number of selected installations whose topic is $z$ after excluding the current selection of the current application; and a further count is, among the remaining $weight - 1$ selected installations of app after excluding the current selection, the number for which $x_{u,n}$ is 0; the remaining counts are defined analogously;
After $\Lambda$ iterations are completed, the model parameters are obtained from the following expressions:
The distribution of each topic over applications is the mined topic. The following formula is used to calculate the user's preference score for a given application:
$$p(app \mid u, l) = p(x=0 \mid u) \sum_{z} p(z \mid u)\, p(app \mid z) + p(x=1 \mid u)\, p(app \mid l)$$
The several applications with the highest preference scores are then recommended to the user.
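One common way to turn final sampling counts into distributions under symmetric conjugate priors is the posterior-mean estimate, in which the prior value is added to every count before normalizing. The sketch below assumes that form; the expressions used by the method itself may differ, and the function name `smoothed_distribution` and its arguments are hypothetical.

```python
def smoothed_distribution(counts, prior):
    """Posterior-mean estimate (count + prior) / (total + dim * prior).

    counts : list of non-negative counts, one per outcome (e.g. per topic or per app)
    prior  : symmetric prior value applied to every dimension (assumed form)
    """
    total = sum(counts) + prior * len(counts)
    return [(c + prior) / total for c in counts]


# Hypothetical example: a user's selections were assigned to two topics 3 and 1 times
theta_u = smoothed_distribution([3, 1], prior=0.5)
```

The same helper would apply to each of the four distributions in turn: selection-mode counts per user, topic counts per user, application counts per topic, and application counts per location.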
The method first quantitatively calculates the weight with which each installed application contributes to the user's topics, then uses these weights to build a model of the process by which a user selects applications to install, and then solves for the parameters of the model. The resulting model simulates how a user selects applications and recommends the applications the user is likely to want, thereby mining the latent features of users and applications from the users' installed-application lists and recommending applications of greater interest to the user.
The same or similar reference numerals correspond to the same or similar components;
The positional relationships described in the drawings are for illustration only and shall not be construed as limiting this patent;
Obviously, the above embodiment of the present invention is merely an example given to illustrate the invention clearly and does not limit the ways in which the invention may be implemented. Those of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the claims of the present invention.