Summary of the invention
The disclosure is primarily directed to the user cold-start problem and proposes a solution to it.
According to a first aspect of the disclosure, a relation mining method between articles of different fields is provided, comprising: obtaining first behavioural information of a user for a first field article and second behavioural information of the user for a second field article; and determining, based on the first behavioural information and the second behavioural information of multiple users, the degree of correlation between the first field article and the second field article.
Preferably, the step of determining the degree of correlation between the first field article and the second field article may include: determining, based on the first behavioural information of the multiple users, a first behavioural characteristic distribution of the first field article relative to the multiple users; determining, based on the second behavioural information of the multiple users, a second behavioural characteristic distribution of the second field article relative to the multiple users; and determining the degree of correlation between the first field article and the second field article according to the degree of similarity between the first behavioural characteristic distribution and the second behavioural characteristic distribution.
Preferably, the first behavioural characteristic distribution and/or the second behavioural characteristic distribution may include one or more of the following: whether a user has performed a behaviour on the article; the user's behaviour count for the article; the user's preference for the article.
Preferably, the first behavioural information and/or the second behavioural information include: whether a user has performed a behaviour on the article; and/or behavioural data generated by the behaviour the user performed on the article.
Preferably, the behavioural data may include one or more of the following: behaviour type; behaviour count; behaviour duration.
Preferably, the first behavioural characteristic distribution includes a first preference of each of the multiple users for the first field article, and the second behavioural characteristic distribution includes a second preference of each of the multiple users for the second field article. The step of determining the degree of correlation between the first field article and the second field article includes: establishing a first preference vector of the multiple users for the first field article and a second preference vector of the multiple users for the second field article; and determining the degree of correlation between the first field article and the second field article by calculating the similarity between the first preference vector and the second preference vector.
Preferably, a user's preference for an article is equal to the sum of the sub-preferences corresponding to each behaviour type among at least some of the behaviour types the user has performed on the article, wherein each sub-preference is positively correlated with both the behaviour count and the behaviour weight.
Preferably, the preference r of a user for an article may be determined using the following formula,
$$ r = \sum_{t \in T} q_t \cdot W_t $$
where T is the set of behaviour types the user has performed on the article, t is a behaviour type, q_t is the behaviour count under behaviour type t, and W_t is the behaviour weight corresponding to behaviour type t.
Preferably, the method may further include: normalizing the first preference vector and the second preference vector respectively.
According to a second aspect of the disclosure, a relation mining method between articles of different fields is further provided, comprising: for each of multiple users, obtaining first behavioural data of the user for one or more first field articles and second behavioural data of the user for one or more second field articles; and determining, based on the first behavioural data and the second behavioural data of the multiple users, the degree of correlation between each of at least some of the first field articles and each of at least some of the second field articles.
According to a third aspect of the disclosure, an article recommendation method is further provided, comprising: obtaining first behavioural data of a user in a first field, the first behavioural data relating to one or more first field articles; choosing a second field article from at least one second field article based on the degree of correlation between at least one of the one or more first field articles and each of the at least one second field article; and recommending the chosen second field article to the user.
Preferably, the degree of correlation between the first field article and the second field article may be obtained using the relation mining method of the first aspect or the second aspect of the disclosure.
Preferably, the step of choosing a second field article from the at least one second field article may include: calculating a recommendation degree for each second field article; and choosing a predetermined number of top-ranked second field articles in descending order of recommendation degree.
Preferably, the recommendation degree of a second field article is positively correlated with the degree of correlation between each of the at least one first field article and that second field article.
Preferably, the recommendation degree of a second field article is equal to the sum of its sub-recommendation degrees with respect to each of the at least one first field article, each sub-recommendation degree being positively correlated with both the degree of correlation between the first field article and the second field article and the user's preference for the first field article.
Preferably, the recommendation degree of a second field article may be calculated using the following formula,
$$ rec_{uj} = \sum_{i \in I} sim(i, j) \cdot r_{ui} $$
where rec_uj denotes the recommendation degree of the second field article j for user u, I is the set of first field articles involved in the user's first behavioural data in the first field, i is a first field article, sim(i, j) denotes the degree of correlation between the first field article i and the second field article j, and r_ui denotes the preference of user u for the first field article i.
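As a minimal sketch of the recommendation calculation just described, the Python below assumes that the recommendation degree is the sum, over the user's first field articles, of sim(i, j) multiplied by r_ui. That product form is consistent with the symbol definitions but is an assumption here. The function names, the correlation values and the user preferences are all hypothetical and serve only as illustration.

```python
def recommendation_degrees(user_prefs, sim):
    """Assumed form: rec_uj = sum over i in I of sim(i, j) * r_ui.

    user_prefs maps each first field article i to the preference r_ui.
    sim maps (i, j) pairs to the degree of correlation sim(i, j).
    """
    rec = {}
    for (i, j), s in sim.items():
        if i in user_prefs:  # only articles in the user's set I contribute
            rec[j] = rec.get(j, 0.0) + s * user_prefs[i]
    return rec


def top_n(rec, n):
    """Return the n second field articles with the highest recommendation degree."""
    return sorted(rec, key=rec.get, reverse=True)[:n]


# Hypothetical data: one user with preferences for two videos.
user_prefs = {"Video 1": 2, "Video 4": 2}
sim = {
    ("Video 1", "Music 1"): 0.20, ("Video 4", "Music 1"): 0.84,
    ("Video 1", "Music 2"): 0.32, ("Video 4", "Music 2"): 0.48,
    ("Video 1", "Music 3"): 0.22, ("Video 4", "Music 3"): 0.81,
}
rec = recommendation_degrees(user_prefs, sim)
chosen = top_n(rec, 2)
print(chosen)
```

With these illustrative numbers, Music 1 and Music 3 receive the two highest recommendation degrees, so they would be the articles recommended when the predetermined number is two.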
Preferably, the user's preference for the first field article is equal to the sum of the sub-preferences corresponding to each behaviour type among at least some of the behaviour types the user has performed on the first field article, wherein each sub-preference is positively correlated with both the behaviour count and the behaviour weight.
Preferably, the first behavioural data includes one or more of the following items of information about the behaviours the user has performed on the first field article: behaviour type; behaviour count; behaviour duration.
According to a fourth aspect of the disclosure, a relation mining device between articles of different fields is further provided, comprising: a behavioural information obtaining module for obtaining first behavioural information of a user for a first field article and second behavioural information of the user for a second field article; and a degree-of-correlation determining module for determining, based on the first behavioural information and the second behavioural information of multiple users, the degree of correlation between the first field article and the second field article.
Preferably, the degree-of-correlation determining module may include: a first behavioural characteristic distribution determining unit for determining, based on the first behavioural information of the multiple users, a first behavioural characteristic distribution of the first field article relative to the multiple users; a second behavioural characteristic distribution determining unit for determining, based on the second behavioural information of the multiple users, a second behavioural characteristic distribution of the second field article relative to the multiple users; and a degree-of-correlation determining unit for determining the degree of correlation between the first field article and the second field article according to the degree of similarity between the first behavioural characteristic distribution and the second behavioural characteristic distribution.
Preferably, the first behavioural characteristic distribution and/or the second behavioural characteristic distribution may include one or more of the following: whether a user has performed a behaviour on the article; the user's behaviour count for the article; the user's preference for the article.
Preferably, the first behavioural information and/or the second behavioural information include: whether a user has performed a behaviour on the article; and/or behavioural data generated by the behaviour the user performed on the article.
Preferably, the behavioural data may include one or more of the following: behaviour type; behaviour count; behaviour duration.
Preferably, the first behavioural characteristic distribution includes a first preference of each of the multiple users for the first field article, and the second behavioural characteristic distribution includes a second preference of each of the multiple users for the second field article. The degree-of-correlation determining unit includes: a vector establishing unit for establishing a first preference vector of the multiple users for the first field article and a second preference vector of the multiple users for the second field article; and a degree-of-correlation calculating unit for determining the degree of correlation between the first field article and the second field article by calculating the similarity between the first preference vector and the second preference vector.
Preferably, a user's preference for an article is equal to the sum of the sub-preferences corresponding to each behaviour type among at least some of the behaviour types the user has performed on the article, wherein each sub-preference is positively correlated with both the behaviour count and the behaviour weight.
Preferably, the first behavioural characteristic distribution determining unit and/or the second behavioural characteristic distribution determining unit may determine the preference r of a user for an article using the following formula,
$$ r = \sum_{t \in T} q_t \cdot W_t $$
where T is the set of behaviour types the user has performed on the article, t is a behaviour type, q_t is the behaviour count under behaviour type t, and W_t is the behaviour weight corresponding to behaviour type t.
Preferably, the device may further include: a normalization module for normalizing the first preference vector and the second preference vector respectively.
According to a fifth aspect of the disclosure, a relation mining device between articles of different fields is further provided, comprising: a behavioural data obtaining module for obtaining, for each of multiple users, first behavioural data of the user for one or more first field articles and second behavioural data of the user for one or more second field articles; and a degree-of-correlation determining module for determining, based on the first behavioural data and the second behavioural data of the multiple users, the degree of correlation between each of at least some of the first field articles and each of at least some of the second field articles.
According to a sixth aspect of the disclosure, an article recommendation device is further provided, comprising: a first behavioural data obtaining module for obtaining first behavioural data of a user in a first field, the first behavioural data relating to one or more first field articles; an article choosing module for choosing a second field article from at least one second field article based on the degree of correlation between at least one of the one or more first field articles and each of the at least one second field article; and an article recommending module for recommending the chosen second field article to the user.
Preferably, the degree of correlation between the first field article and the second field article may be obtained using the relation mining method of the first aspect or the second aspect of the disclosure.
Preferably, the article choosing module may include: a recommendation degree calculating unit for calculating a recommendation degree for each second field article; and an article selecting unit for choosing a predetermined number of top-ranked second field articles in descending order of recommendation degree.
Preferably, the recommendation degree of a second field article is positively correlated with the degree of correlation between each of the at least one first field article and that second field article.
Preferably, the recommendation degree of each second field article is equal to the sum of its sub-recommendation degrees with respect to each of the at least one first field article, each sub-recommendation degree being positively correlated with both the degree of correlation between the first field article and the second field article and the user's preference for the first field article.
Preferably, the recommendation degree calculating unit may calculate the recommendation degree of a second field article using the following formula,
$$ rec_{uj} = \sum_{i \in I} sim(i, j) \cdot r_{ui} $$
where rec_uj denotes the recommendation degree of the second field article j for user u, I is the set of first field articles involved in the user's first behavioural data in the first field, i is a first field article, sim(i, j) denotes the degree of correlation between the first field article i and the second field article j, and r_ui denotes the preference of user u for the first field article i.
According to a seventh aspect of the disclosure, a computing device is further provided, comprising: a processor; and a memory on which executable code is stored, the executable code, when executed by the processor, causing the processor to perform the method of the first aspect or the second aspect of the disclosure.
According to an eighth aspect of the disclosure, a non-transitory machine-readable storage medium is further provided, on which executable code is stored, the executable code, when executed by a processor of an electronic device, causing the processor to perform the method of the first aspect or the second aspect of the disclosure.
The relation mining scheme of the disclosure can determine the degree of correlation between cross-field articles. When a user has no behavioural data in a target field, the articles involved in the user's known behavioural data in other fields can be mapped, based on the degree of correlation between cross-field articles, onto similar articles in the target field, thereby solving the user cold-start problem and improving the user experience.
Specific embodiments
Preferred embodiments of the disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show preferred embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be thorough and complete and will fully convey the scope of the disclosure to those skilled in the art.
[Overview]
The disclosure is primarily directed to the user cold-start problem that arises in information recommendation, and proposes a solution to it. The core idea of the disclosure is to use users as the linking tie: by collecting the behavioural data of multiple users for articles in different fields, an association between articles of different fields (the degree of correlation described below) is established in advance.
For a user who has no behavioural data in a target field, when articles in the target field are to be recommended, target field articles that are highly correlated with the other-field articles the user has browsed can be selected from the target field and recommended to the user, based on the pre-established association between articles of different fields and the user's behavioural data for articles in other fields. The user cold-start problem can thereby be solved.
The articles addressed in the disclosure mainly refer to articles displayed on the internet; they may be virtual articles such as news, pictures, videos and music, or physical articles. For example, where the field in question is a shopping platform, the articles in that field may be the virtual or physical goods for sale shown on the platform, for instance virtual articles such as game currency and game item gift packs, or physical articles such as clothes and digital products sold by merchants.
The fields addressed in the disclosure may be divided in many ways.
As one example of the disclosure, different applications may be regarded as different fields, and different modules within the same application may also be regarded as different fields. Taking a news application (such as Toutiao) as an example, the different channels in the application may be regarded as different fields; for instance, Toutiao's video channel, social channel, entertainment channel, finance channel and fashion channel may be regarded as different fields. Taking a shopping platform (such as JD.com) as an example, commodity categories in the store, such as complete computers, office consumables, electrical appliances, mobile phones and digital products, and supermarket general merchandise, may be regarded as different fields.
As another example of the disclosure, different fields may also be divided according to the attributes of the articles; for instance, articles may be divided by format into fields such as music, video, picture and novel. An already-divided field may be divided further. For example, the video field may be further divided, according to the labels the videos carry, into fields such as romance dramas, anti-Japanese war dramas and costume dramas.
As yet another example of the disclosure, different types of application may also be regarded as different fields. For instance, social communication applications (such as WeChat and QQ) and news reading applications (such as Toutiao, Neihan Duanzi and Phoenix News) may be regarded as different fields.
There may also be many other ways of dividing fields, which are not described here.
The aspects involved in the technical solution of the disclosure are described in turn below.
[Establishing relationships between cross-field articles]
Fig. 1 is a schematic flow chart of a method for mining relationships between articles of different fields according to an embodiment of the disclosure.
Referring to Fig. 1, in step S110, first behavioural information of a user for a first field article and second behavioural information of the user for a second field article are obtained.
The first behavioural information and/or the second behavioural information may include information indicating whether the user has performed a behaviour on the article and, where the user has performed a behaviour on the article, may further include behavioural data generated by that behaviour.
For example, behavioural information (the first and/or second behavioural information) that is empty may indicate that the user has not performed a behaviour on the article, and behavioural information that is not empty may indicate that the user has performed a behaviour on the article. As another example, the behavioural information may include information indicating whether the user has performed a behaviour on the article, such as identification information; the identification information may be "1" or "0", where "1" indicates that the user has performed a behaviour on the article and "0" indicates that the user has not.
The behavioural data may include data generated when the user performs behaviours on the article, and may also include statistics compiled over the behaviours the user has performed. For example, the behavioural data may include, but is not limited to, the behaviour types the user has performed on the article (such as clicking, playing and evaluating), the behaviour count for each behaviour type, and the behaviour duration.
In step S120, the degree of correlation between the first field article and the second field article is determined based on the first behavioural information and the second behavioural information of multiple users.
The disclosure may use users as the linking tie: by analysing the first behavioural information and the second behavioural information of each of the multiple users, the first behavioural characteristic distribution of the first field article relative to the multiple users and the second behavioural characteristic distribution of the second field article relative to the multiple users are determined respectively.
The first behavioural characteristic distribution characterizes the behavioural characteristics of each of the multiple users with respect to the first field article, and the second behavioural characteristic distribution characterizes the behavioural characteristics of each of the multiple users with respect to the second field article. As an example, a behavioural characteristic may be whether the user has performed a behaviour on the article, the user's behaviour count for the article, or a calculated preference of the user for the article. Whether the user has performed a behaviour on the article and the behaviour count can be obtained from the behavioural information; the calculation of the preference is described below and is not repeated here.
For a specific first field article A and second field article B, the degree of correlation between A and B can be determined according to the degree of similarity between the first behavioural characteristic distribution of A and the second behavioural characteristic distribution of B. The principle is that if the first behavioural characteristic distribution of the multiple users for A is similar to their second behavioural characteristic distribution for B, A and B can be considered strongly correlated; otherwise they are weakly correlated.
|        | Video 1 | Video 2 | Video 3 | Video 4 | Music 1 | Music 2 | Music 3 |
|--------|---------|---------|---------|---------|---------|---------|---------|
| User 1 | 1       | 0       | 0       | 1       | 1       | 1       | 0       |
| User 2 | 1       | 0       | 1       | 0       | 0       | 0       | 1       |
| User 3 | 0       | 0       | 1       | 0       | 0       | 1       | 0       |
| User 4 | 0       | 1       | 0       | 1       | 1       | 0       | 1       |
| User 5 | 0       | 0       | 0       | 0       | 1       | 1       | 0       |
Table one
Taking Table one as an example, the first field may be video and the second field may be music; the first field articles are specific videos, such as Video 1 to Video 4, and the second field articles are specific pieces of music, such as Music 1 to Music 3. In the table, a 1 under an article in a user's row indicates that the user has performed a behaviour on that article (video or music) and that behavioural data exists (such as clicking, playing or collecting); a 0 indicates that the user has not performed a behaviour on that article and that no behavioural data exists.
As shown in Table one, user 4 has performed a behaviour on Video 2, and user 1, user 3 and user 5 have performed behaviours on Music 2. The first behavioural characteristic distribution of Video 2 can therefore be expressed as {0,0,0,1,0}, and the second behavioural characteristic distribution of Music 2 as {1,0,1,0,1}. Since Video 2 and Music 2 have no users in common, they can be considered weakly correlated, that is, the degree of correlation between them can be taken to be zero.
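The Table one example can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names `feature_distribution` and `common_users` are hypothetical, and counting shared users is just one simple way of checking overlap between two binary distributions.

```python
# Rows of Table one: one row per user, one column per article.
articles = ["Video 1", "Video 2", "Video 3", "Video 4",
            "Music 1", "Music 2", "Music 3"]
table_one = [
    [1, 0, 0, 1, 1, 1, 0],  # User 1
    [1, 0, 1, 0, 0, 0, 1],  # User 2
    [0, 0, 1, 0, 0, 1, 0],  # User 3
    [0, 1, 0, 1, 1, 0, 1],  # User 4
    [0, 0, 0, 0, 1, 1, 0],  # User 5
]


def feature_distribution(article):
    """Return the column of Table one for an article (one entry per user)."""
    col = articles.index(article)
    return [row[col] for row in table_one]


def common_users(article_a, article_b):
    """Count the users who performed a behaviour on both articles."""
    a = feature_distribution(article_a)
    b = feature_distribution(article_b)
    return sum(x & y for x, y in zip(a, b))


video2 = feature_distribution("Video 2")  # [0, 0, 0, 1, 0]
music2 = feature_distribution("Music 2")  # [1, 0, 1, 0, 1]
print(common_users("Video 2", "Music 2"))  # 0: no users in common
```

By contrast, Video 4 and Music 1 share two users (user 1 and user 4), so under this binary view they overlap more than Video 2 and Music 2 do.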
[Calculation of the degree of correlation]
As one example of the disclosure, the first behavioural characteristic distribution may characterize how the preferences of the multiple users for the first field article are distributed, and the second behavioural characteristic distribution may characterize how the preferences of the multiple users for the second field article are distributed. That is, the first behavioural characteristic distribution may include a first preference of each of the multiple users for the first field article, and the second behavioural characteristic distribution may include a second preference of each of the multiple users for the second field article.
Then, for a specific first field article A and second field article B, the degree of correlation between A and B can be determined by calculating the similarity between the distribution of the multiple users' preferences for A and the distribution of their preferences for B. The degree of correlation is calculated as follows.
Step 1: preference calculation
As described above, where the user has performed a behaviour on the article, the behavioural information may further include behavioural data generated by that behaviour. For ease of distinction, the behavioural data included in the first behavioural information may be called "first behavioural data", and the behavioural data included in the second behavioural information may be called "second behavioural data". Where the user has not performed a behaviour on the article, the behavioural information does not include behavioural data or, put another way, the behavioural data it includes is a null value.
According to the first behavioural data and the second behavioural data of each of the multiple users, each user's first preference for a first field article in the first field and second preference for a second field article in the second field can be calculated.
As described above, a user's behavioural data for an article may cover several behaviour types, such as clicking, playing and evaluating, and the number of times each behaviour type is performed also varies. The user's total preference for an article can therefore be taken to be equal to the sum of the sub-preferences corresponding to each behaviour type among at least some of the behaviour types the user has performed on the article, where each sub-preference may be positively correlated with both the behaviour count and the behaviour weight.
For example, the preference r of a user for an article may be calculated using the following formula,
$$ r = \sum_{t \in T} q_t \cdot W_t $$
where T is at least some (preferably all) of the behaviour types the user has performed on the article, t is a behaviour type, q_t is the behaviour count under behaviour type t, and W_t is the behaviour weight corresponding to behaviour type t. The behaviour type and behaviour count can be obtained from the behavioural data. The weights corresponding to different behaviour types may be predefined by assignment, or determined in other ways; for instance, the weight of a behaviour type may be determined according to its behaviour duration.
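A minimal sketch of this preference calculation follows. It assumes the weighted-sum form in which each sub-preference is the behaviour count multiplied by the behaviour weight, which satisfies the positive-correlation requirement but is an assumption. The function name `preference`, the behaviour type names and the weight values are hypothetical.

```python
def preference(behaviour_counts, behaviour_weights):
    """Assumed form: r = sum over behaviour types t of q_t * W_t.

    behaviour_counts maps behaviour type t to its count q_t.
    behaviour_weights maps behaviour type t to its weight W_t.
    """
    return sum(count * behaviour_weights[t]
               for t, count in behaviour_counts.items())


# Hypothetical predefined weights, one per behaviour type.
weights = {"click": 1.0, "play": 2.0, "collect": 3.0}

# A user clicked an article three times and played it once.
counts = {"click": 3, "play": 1}
r = preference(counts, weights)
print(r)  # 3 * 1.0 + 1 * 2.0 = 5.0
```

A user with no behavioural data for the article gets a preference of zero, which matches the zero entries used in the preference vectors.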
Step 2: establishing the vectors
A first preference vector of the multiple users for the first field article and a second preference vector of the multiple users for the second field article can be established. The number of elements in a preference vector equals the number of users, and the value of each element is the corresponding user's preference for the article in question.
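The vector construction can be sketched as follows. This is an illustrative sketch: the function name `build_preference_vector` and the dictionary layout keyed by (user, article) are assumptions, and the sample values are chosen to match the Video 1 and Music 1 columns of Table two.

```python
def build_preference_vector(article, users, preferences):
    """Return one preference per user, in a fixed user order.

    preferences maps (user, article) to a preference value; a missing
    entry means the user has no behavioural data for that article, so
    the element is 0.
    """
    return [preferences.get((user, article), 0) for user in users]


users = ["User 1", "User 2", "User 3", "User 4", "User 5"]
preferences = {
    ("User 1", "Video 1"): 2, ("User 2", "Video 1"): 5,
    ("User 1", "Music 1"): 4, ("User 4", "Music 1"): 5,
    ("User 5", "Music 1"): 4,
}

video1_vector = build_preference_vector("Video 1", users, preferences)
music1_vector = build_preference_vector("Music 1", users, preferences)
print(video1_vector)  # [2, 5, 0, 0, 0]
print(music1_vector)  # [4, 0, 0, 5, 4]
```

Keeping the same user order for both vectors matters, because the similarity calculation compares the two vectors element by element.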
Step 3: calculating the similarity between the vectors
Various calculation methods can be used to calculate the similarity between the first preference vector and the second preference vector. The calculated similarity can serve as the degree of correlation between the corresponding first field article and second field article.
For example, the calculation can use any of a variety of vector similarity measures, such as cosine similarity, Jaccard similarity, the Pearson correlation coefficient, Euclidean distance, Manhattan distance and Mahalanobis distance.
It should be noted that the preferences calculated for articles of different fields may be on different scales. To avoid differences caused by differing scales, the first preference vector and the second preference vector calculated in step 1 can each be normalized, and the normalized preference vectors used in calculating the degree of correlation. Common normalization methods include min-max standardization, log function conversion and z-score standardization; the normalization process is not described further here.
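As an illustration of one of the normalization methods named above, the following sketch applies min-max standardization to a preference vector. The function name `min_max_normalize` and the handling of a constant vector (mapped to all zeros) are assumptions made for this example.

```python
def min_max_normalize(vector):
    """Rescale a vector linearly so its values lie in [0, 1]."""
    lo, hi = min(vector), max(vector)
    if hi == lo:
        # Assumed convention: a constant vector carries no ranking
        # information, so map every element to 0.0.
        return [0.0 for _ in vector]
    return [(x - lo) / (hi - lo) for x in vector]


normalized = min_max_normalize([2, 5, 0, 0, 0])
print(normalized)  # [0.4, 1.0, 0.0, 0.0, 0.0]
```

After this step, vectors from fields whose raw preferences span different ranges are brought onto a common 0-to-1 scale before the similarity is computed.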
Sample calculation
|        | Video 1 | Video 2 | Video 3 | Video 4 | Music 1 | Music 2 | Music 3 |
|--------|---------|---------|---------|---------|---------|---------|---------|
| User 1 | 2       | 0       | 0       | 2       | 4       | 5       | 0       |
| User 2 | 5       | 0       | 4       | 0       | 0       | 0       | 1       |
| User 3 | 0       | 0       | 5       | 0       | 0       | 2       | 0       |
| User 4 | 0       | 1       | 0       | 3       | 5       | 0       | 4       |
| User 5 | 0       | 0       | 0       | 0       | 4       | 2       | 0       |
Table two
The number under each article in a user's row of Table two represents that user's preference for the article. From the preferences in Table two, the preference vector of Video 1 is {2,5,0,0,0}, that of Video 2 is {0,0,0,1,0}, that of Video 3 is {0,4,5,0,0}, that of Video 4 is {2,0,0,3,0}, that of Music 1 is {4,0,0,5,4}, that of Music 2 is {5,0,2,0,2}, and that of Music 3 is {0,1,0,4,0}.
The degree of correlation between articles of different fields can be calculated using cosine similarity. The results are shown in Table three below; the detailed calculation is not expanded here.
|         | Video 1 | Video 2 | Video 3 | Video 4 |
|---------|---------|---------|---------|---------|
| Music 1 | 0.20    | 0.66    | 0.00    | 0.84    |
| Music 2 | 0.32    | 0.00    | 0.27    | 0.48    |
| Music 3 | 0.22    | 0.97    | 0.15    | 0.81    |
Table three
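The cosine similarity calculation behind Table three can be sketched as follows, using the preference vectors read from Table two. The function name `cosine_similarity` and the convention of returning 0.0 for an all-zero vector are assumptions of this sketch; no normalization step is applied here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for an article nobody interacted with
    return dot / (norm_a * norm_b)


# Preference vectors taken from the columns of Table two.
videos = {
    "Video 1": [2, 5, 0, 0, 0],
    "Video 2": [0, 0, 0, 1, 0],
    "Video 3": [0, 4, 5, 0, 0],
    "Video 4": [2, 0, 0, 3, 0],
}
music = {
    "Music 1": [4, 0, 0, 5, 4],
    "Music 2": [5, 0, 2, 0, 2],
    "Music 3": [0, 1, 0, 4, 0],
}

# Degree of correlation for every (music, video) pair.
correlation = {
    m: {v: cosine_similarity(mv, vv) for v, vv in videos.items()}
    for m, mv in music.items()
}
for m, row in correlation.items():
    print(m, {v: round(s, 3) for v, s in row.items()})
```

The computed values agree with Table three to within about 0.01; pairs with no users in common, such as Music 1 and Video 3, come out as exactly zero.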
So far, taking the first field and the second field as an example, the process of determining the degree of correlation between cross-field articles has been described. It should be noted that, with the relation mining scheme of the disclosure, the degree of correlation between the articles of any two of multiple different fields can be mined.
In addition, the degree of correlation between articles of different fields may be determined in other ways. For example, for articles of different fields, labels or keywords that characterize their attributes in multiple dimensions may be extracted; for instance, a topic model may be used to map an article onto a label according to its element attribute information. A seq2vec approach may also be used to map an article onto a vector according to its element attribute information. The degree of correlation between articles of different fields can then be determined by calculating the degree of similarity between the labels or vectors of the different articles.
As an example of the present disclosure, keywords of items of different domains can be extracted, and the degree of correlation between the items can be determined by analyzing the similarity between their keywords. Specifically, for a first-domain item in the first domain, one or more keywords of the first-domain item can be extracted to generate a first keyword vector. For a second-domain item in the second domain, one or more keywords of the second-domain item can be extracted to generate a second keyword vector. The similarity between the first keyword vector and the second keyword vector can be calculated using cosine similarity, which likewise yields the degree of correlation between the first-domain item and the second-domain item.
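The keyword-based comparison can be sketched as follows. This is a minimal illustration, assuming each item is represented by a bag of extracted keywords and that a keyword vector simply counts occurrences; the function name `keyword_similarity` and the sample keywords are hypothetical and do not come from the disclosure.

```python
import math
from collections import Counter


def keyword_similarity(keywords_a, keywords_b):
    """Cosine similarity between two keyword lists (bag-of-keywords assumption)."""
    vec_a, vec_b = Counter(keywords_a), Counter(keywords_b)
    # Only keywords present in both items contribute to the dot product.
    dot = sum(vec_a[k] * vec_b[k] for k in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an item without keywords is treated as unrelated
    return dot / (norm_a * norm_b)


# Hypothetical keywords for a first-domain item and a second-domain item.
novel = ["romance", "fantasy", "immortal"]
series = ["romance", "fantasy", "drama"]
print(round(keyword_similarity(novel, series), 2))  # 2 shared keywords out of 3 each -> 0.67
```

Counting keywords over their union avoids fixing a vocabulary in advance; a weighted scheme could be substituted without changing the cosine step.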
Based on the concept of the present disclosure, the degree of correlation between cross-domain items can also be determined in various other ways, which are not described further here.
As an example of the present disclosure, for each of multiple users, first behavior data of the user for one or more first-domain items and second behavior data of the user for one or more second-domain items can be obtained.
The multiple users described here are preferably users who have behavior data in both the first domain and the second domain. The first domain is different from the second domain. As described above regarding how domains are divided, the first domain and the second domain may be different applications, different modules within the same application, different types of applications, or different domains divided by attribute, such as music, video, and pictures.
The first behavior data and the second behavior data can be collected by a client log collection system. When behavior data is collected from client logs, the logs can also be cleaned to filter out invalid entries caused by abnormal users, abnormal user operations, server exceptions, and the like. The first behavior data and the second behavior data described here refer to the user's total behavior data in the respective domain, and may involve one or more items.
Based on the first behavior data and the second behavior data of the multiple users, the relationship mining method described above with reference to Fig. 1 can be used to determine the degree of correlation between each of at least some of the first-domain items and each of at least some of the second-domain items. The specific process of determining the degree of correlation is not repeated here.
[Cross-domain item recommendation]
Fig. 2 is a schematic flowchart of an item recommendation method according to one embodiment of the present disclosure.
Referring to Fig. 2, in step S210, first behavior data of a user in the first domain is obtained, the first behavior data involving one or more first-domain items.
In step S220, based at least one of one or more of first fields article respectively at least one the
The degree of correlation between each of two field articles chooses the second field article from least one second field article.
The user in this embodiment is a user who lacks behavior data in the second domain; that is, the user can be regarded as a new user in the second domain. When second-domain items are recommended to the user, the user cold-start problem arises, whereas the user has no cold-start problem in the first domain.
Unlike the first domain described in the above scheme for mining relationships between items of different domains, the first domain in this embodiment may refer to one or more other known domains, different from the second domain, in which the user has behavior data.
That is, when second-domain items are recommended to the user, second-domain items that have a high degree of correlation with some or all of the items involved in the user's behavior data in one or more other domains, where that behavior data is known, can be selected from the second domain as items suitable for recommendation to the user.
The degree of correlation between a first-domain item and a second-domain item may be determined in advance, for example by the method described above for mining relationships between items of different domains.
In step S230, the selected second-domain item is recommended to the user.
Thus, when content (items) in an unknown domain is recommended to a user, the user's known behavior data in other content domains can be used to find similar items across domains, mapping the user's interests in the other content domains into the unknown domain. This solves the user cold-start problem in the unknown domain and improves the user experience.
As an example of the present disclosure, when second-domain items are selected, the recommendation degree of each of the at least one second-domain item can be calculated, and a predetermined number of top-ranked second-domain items can be selected in descending order of recommendation degree.
The recommendation degree of a second-domain item may be positively correlated with the degree of correlation between the second-domain item and each of at least some (preferably all) of the first-domain items in the set of items involved in the user's first behavior data.
For example, the recommendation degree of a second-domain item may be equal to the sum of the sub-recommendation degrees of the second-domain item with respect to each first-domain item involved in the user's first behavior data. Each sub-recommendation degree is positively correlated with both the degree of correlation between the first-domain item and the second-domain item and the user's preference for the first-domain item.
Specifically, the recommendation degree of a second-domain item can be calculated using the following formula,
$$rec_{uj} = \sum_{i \in I} sim(i, j) \cdot r_{ui}$$
where $rec_{uj}$ denotes the recommendation degree of second-domain item j for user u, I is the set of first-domain items involved in the user's first behavior data in the first domain, i is a first-domain item, sim(i, j) denotes the degree of correlation between first-domain item i and second-domain item j, and $r_{ui}$ denotes the preference of user u for first-domain item i. The preference can be regarded as the weight of the degree of correlation sim(i, j) between first-domain item i and second-domain item j. For how the preference is calculated, see the related description above, which is not repeated here.
Concrete application example
The present disclosure can be used for recommending videos, music, news, applications, games, themes, and the like on various electronic devices where the user cold-start problem exists, such as mobile phones, tablets, computers, televisions, smart speakers, and smart watches.
Fig. 3 is an overall implementation flowchart showing the cross-domain item relationship mining scheme and the user cold-start scheme of the present disclosure. The implementation steps shown in Fig. 3 are as follows.
Step 1: collect the behavior logs generated by users
A client log collection system can be used to collect behavior data, such as clicks, plays, and ratings, of users on items in different domains across various terminals.
Step 2: clean the logs and calculate preference data
The raw logs are first cleaned to filter out invalid logs caused by abnormal users, misoperations, server exceptions, and the like. The behavior data is then analyzed to obtain each user's preference for each item. How the preference is calculated is not repeated here.
Step 3: calculate the relationship data between cross-domain items from the preference data
From the preference data obtained in step 2, the preference vector of each item of the different domains over the multiple users (User1 to User5 in the figure) can be derived. As shown in Fig. 3, the preference vector of Video1 is {2, 5, 0, 0, 0}, the preference vector of Video2 is {0, 0, 0, 1, 0}, the preference vector of Video3 is {0, 4, 5, 0, 0}, the preference vector of Video4 is {2, 0, 0, 3, 0}, the preference vector of Music1 is {4, 0, 0, 5, 4}, the preference vector of Music2 is {5, 0, 2, 0, 2}, and the preference vector of Music3 is {0, 1, 0, 4, 0}.
Cosine similarity can be used to calculate the similarity between the preference vectors of items of different domains. This similarity serves as the degree of correlation between the items, which yields the relationship data between items of different domains.
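Step 3 can be sketched in a few lines. The preference vectors below are those given for Fig. 3; the function names `cosine` and `correlation_table` are illustrative choices, not names used in the disclosure.

```python
import math

# Preference vectors over User1..User5, as given for Fig. 3.
VIDEOS = {
    "Video1": [2, 5, 0, 0, 0],
    "Video2": [0, 0, 0, 1, 0],
    "Video3": [0, 4, 5, 0, 0],
    "Video4": [2, 0, 0, 3, 0],
}
MUSIC = {
    "Music1": [4, 0, 0, 5, 4],
    "Music2": [5, 0, 2, 0, 2],
    "Music3": [0, 1, 0, 4, 0],
}


def cosine(a, b):
    """Cosine similarity of two equal-length preference vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def correlation_table(first, second):
    """Degree of correlation for every (first-domain item, second-domain item) pair."""
    return {(i, j): cosine(u, v) for i, u in first.items() for j, v in second.items()}


table = correlation_table(MUSIC, VIDEOS)
for m in MUSIC:
    print(m, [f"{table[(m, v)]:.2f}" for v in VIDEOS])
```

The printed values agree with the two-decimal figures of Table 3 to within 0.01; small differences in the last digit come from rounding.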
Step 4: calculate the recommendation degree of items
User5 has no behavior data in the video domain (Video), so User5 can be regarded as a new user in the video domain, and recommending videos to User5 faces the cold-start problem.
The recommendation degrees of the different videos can be calculated for User5 from User5's behavior data in the music domain and the predetermined relationship data between cross-domain items in the video domain and the music domain.
Specifically, the recommendation degrees of the different videos for User5 can be calculated using the following formula.
$$rec_{uj} = \sum_{i \in I} sim(i, j) \cdot r_{ui}$$
Here, $rec_{uj}$ denotes the recommendation degree of video j for User5, and I is the set of music items involved in User5's behavior data in the music domain, namely {Music1, Music2}. sim(i, j) denotes the degree of correlation between video j and music i, and $r_{ui}$ denotes User5's preference for music i.
As shown in Fig. 3, the expanded formula for calculating the recommendation degree of Video1 is (similarity_m1v1)·(value_m1) + (similarity_m2v1)·(value_m2) + (similarity_m3v1)·(value_m3), where similarity_m1v1 denotes the degree of correlation between Music1 and Video1 and value_m1 denotes User5's preference for Music1; similarity_m2v1 denotes the degree of correlation between Music2 and Video1 and value_m2 denotes User5's preference for Music2; and similarity_m3v1 denotes the degree of correlation between Music3 and Video1 and value_m3 denotes User5's preference for Music3.
Using the above calculation, the recommendation degrees of the different videos for User5 are finally obtained: 1.4 for Video1, 2.4 for Video2, 0.5 for Video3, and 4.2 for Video4.
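For illustration, the Video1 figure can be checked by hand. The correlations are the two-decimal values of the Video 1 column of the correlation table, and User5's music preferences are the fifth components of the Music1, Music2, and Music3 preference vectors (4, 2, and 0):

```latex
\[
rec_{\text{User5},\text{Video1}}
  = 0.20 \cdot 4 + 0.32 \cdot 2 + 0.22 \cdot 0
  = 0.80 + 0.64 + 0
  = 1.44 \approx 1.4
\]
```

The Music3 term vanishes because User5 has no preference recorded for Music3, which is consistent with I = {Music1, Music2}.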
Step 5: select items for recommendation according to the recommendation degree ranking
As shown in Fig. 3, the items can be sorted in descending order of recommendation degree, and the top-ranked items can be presented to the user as a recommendation list. For example, the top-ranked Video4 and Video2 can be recommended to User5.
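Steps 4 and 5 can be sketched together as follows, using the two-decimal correlations of the music/video correlation table and User5's music preferences. The function names are illustrative. Because the inputs are rounded, the individual scores computed here may differ slightly from the figures quoted in the text; the resulting order is the same.

```python
# Degrees of correlation between music and video items (two-decimal table values).
SIM = {
    "Music1": {"Video1": 0.20, "Video2": 0.66, "Video3": 0.00, "Video4": 0.84},
    "Music2": {"Video1": 0.32, "Video2": 0.00, "Video3": 0.27, "Video4": 0.48},
    "Music3": {"Video1": 0.22, "Video2": 0.97, "Video3": 0.15, "Video4": 0.81},
}
# User5's preferences in the music domain (fifth component of each Music vector).
USER5_PREFS = {"Music1": 4, "Music2": 2, "Music3": 0}


def recommendation_degrees(prefs, sim):
    """rec_uj = sum over i of sim(i, j) * r_ui, for every candidate item j."""
    candidates = {j for row in sim.values() for j in row}
    return {j: sum(sim[i][j] * r for i, r in prefs.items()) for j in candidates}


def top_n(scores, n):
    """Items in descending order of recommendation degree, truncated to n."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


scores = recommendation_degrees(USER5_PREFS, SIM)
print(top_n(scores, 2))  # ['Video4', 'Video2']
```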
Accordingly, a new user who lacks behavior data in the target domain, and who previously could see only non-personalized information, can see personalized recommendation results with the present disclosure. As shown in Fig. 4A and Fig. 4B, although the user has never used the video center, the user has read the novel "Three Lives Three Worlds, Ten Miles of Peach Blossoms"; therefore, after opening the "Guess You Like" module of the video application, the user can see the recommended TV series "Three Lives Three Worlds, Ten Miles of Peach Blossoms".
In summary, the present disclosure can use data from other domains to compensate for insufficient user behavior data in the target domain, thereby solving the user cold-start problem in a recommender system and improving the user's experience of the recommender system.
The method for mining relationships between items of different domains and the item recommendation method of the present disclosure have been described in detail above with reference to Fig. 1 to Fig. 3. The device for mining relationships between items of different domains, the item recommendation device, and the computing device of the present disclosure are described below with reference to Fig. 5 to Fig. 8.
[Relationship mining device]
Fig. 5 is a schematic block diagram showing the structure of a device of the present disclosure for mining relationships between items of different domains. The relevant details are the same as described above with reference to Fig. 1 and are not repeated here.
Referring to Fig. 5, the relationship mining device 300 may include a behavior information acquisition module 310 and a degree-of-correlation determination module 320.
The behavior information acquisition module 310 can be used to obtain first behavior information of a user for first-domain items and second behavior information of the user for second-domain items.
The degree-of-correlation determination module 320 can determine the degree of correlation between a first-domain item and a second-domain item based on the first behavior information and the second behavior information of multiple users.
As shown in Fig. 5, the degree-of-correlation determination module 320 may optionally include a first behavior feature distribution determination unit 321, a second behavior feature distribution determination unit 323, and a degree-of-correlation determination unit 325, shown in dashed boxes in the figure.
The first behavior feature distribution determination unit 321 can determine, based on the first behavior information of the multiple users, the first behavior feature distribution of the first-domain item over the multiple users.
The second behavior feature distribution determination unit 323 can determine, based on the second behavior information of the multiple users, the second behavior feature distribution of the second-domain item over the multiple users.
The degree-of-correlation determination unit 325 can determine the degree of correlation between the first-domain item and the second-domain item according to the similarity between the first behavior feature distribution of the first-domain item and the second behavior feature distribution of the second-domain item.
The first behavior feature distribution and/or the second behavior feature distribution may include one or more of the following: whether the user has performed a behavior on the item, the number of behaviors the user has performed on the item, and the user's preference for the item.
As an example, the first behavior feature distribution may include the first preference of each of the multiple users for the first-domain item, and the second behavior feature distribution may include the second preference of each of the multiple users for the second-domain item.
The user's preference for an item may be equal to the sum of the sub-preferences corresponding to each behavior type among some or all of the behavior types the user has performed on the item, where each sub-preference is positively correlated with both the behavior count and the behavior weight. For example, the first behavior feature distribution determination unit 321 and/or the second behavior feature distribution determination unit 323 can calculate the user's preference r for an item using the following formula,
$$r = \sum_{t \in T} q_t \cdot W_t$$
where T is the set of all behavior types the user has performed on the item, t is a behavior type, $q_t$ is the behavior count under behavior type t, and $W_t$ is the behavior weight corresponding to behavior type t.
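A minimal sketch of the preference calculation, assuming the sub-preference for each behavior type is the product of the behavior count q_t and the behavior weight W_t. The function name `preference` and the weight values are hypothetical; the disclosure does not fix particular weights.

```python
def preference(behavior_counts, behavior_weights):
    """r = sum over behavior types t of q_t * W_t."""
    return sum(count * behavior_weights[t] for t, count in behavior_counts.items())


# Hypothetical weights per behavior type (not specified in the disclosure).
WEIGHTS = {"click": 1.0, "play": 2.0, "rate": 3.0}
# Hypothetical behavior counts of one user for one item.
counts = {"click": 3, "play": 1}
print(preference(counts, WEIGHTS))  # 3*1.0 + 1*2.0 = 5.0
```

Behavior types for which the user has no recorded behavior simply do not appear in `behavior_counts`, so a user with no behavior on an item gets a preference of 0.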
The degree-of-correlation determination unit 325 may include a vector establishing unit 3251 and a correlation calculation unit 3253.
The vector establishing unit 3251 is used to establish a first preference vector of the multiple users for the first-domain item and a second preference vector of the multiple users for the second-domain item.
The correlation calculation unit 3253 can determine the degree of correlation between the first-domain item and the second-domain item by calculating the similarity between the first preference vector and the second preference vector.
As shown in Fig. 5, the relationship mining device 300 may also optionally include a normalization module 330, shown in a dashed box in the figure. The normalization module 330 can normalize the first preference vector and the second preference vector respectively, and the correlation calculation unit 3253 can calculate the similarity between the normalized first preference vector and the normalized second preference vector as the degree of correlation between the first-domain item and the second-domain item.
Fig. 6 is a schematic block diagram showing the structure of a device of the present disclosure for mining relationships between items of different domains. The relevant details are the same as described above with reference to Fig. 1 and are not repeated here.
Referring to Fig. 6, the relationship mining device 600 may include a behavior data acquisition module 610 and a degree-of-correlation determination module 620.
The behavior data acquisition module 610 is used to obtain, for each of multiple users, first behavior data of the user for one or more first-domain items in the first domain and second behavior data of the user for one or more second-domain items in the second domain.
The first behavior data and the second behavior data may include one or more of the following: behavior type, behavior count, and behavior duration.
The degree-of-correlation determination module 620 is used to determine, based on the first behavior data and the second behavior data of the multiple users, the degree of correlation between each of at least some of the first-domain items and each of at least some of the second-domain items. For the specific way of determining the degree of correlation between a first-domain item and a second-domain item, see the related description above, which is not repeated here.
[Item recommendation device]
Fig. 7 is a schematic block diagram showing the structure of an item recommendation device of the present disclosure. The relevant details are the same as described above with reference to Fig. 2 and are not repeated here.
Referring to Fig. 7, the item recommendation device 400 may include a first behavior data acquisition module 410, an item selection module 420, and an item recommendation module 430.
The first behavior data acquisition module 410 can obtain first behavior data of a user in the first domain, the first behavior data involving one or more first-domain items.
The item selection module 420 can select a second-domain item from at least one second-domain item based on the degree of correlation between at least one of the one or more first-domain items and each of the at least one second-domain item. The degree of correlation between a first-domain item and a second-domain item may be obtained using the relationship mining method described above.
The item recommendation module 430 can be used to recommend the selected second-domain item to the user.
As shown in Fig. 7, the item selection module 420 may also optionally include a recommendation degree calculation unit 421 and an item selection unit 423, shown in dashed boxes in the figure.
The recommendation degree calculation unit 421 can be used to calculate the recommendation degree of each second-domain item. The item selection unit 423 can select a predetermined number of top-ranked second-domain items in descending order of recommendation degree.
The recommendation degree of each second-domain item may be positively correlated with the degree of correlation between the second-domain item and each first-domain item involved in the user's first behavior data.
As an example, the recommendation degree of a second-domain item is equal to the sum of the sub-recommendation degrees of the second-domain item with respect to each of the at least one first-domain item, where each sub-recommendation degree is positively correlated with both the degree of correlation between the first-domain item and the second-domain item and the user's preference for the first-domain item.
For example, the recommendation degree calculation unit 421 can calculate the recommendation degree of a second-domain item using the following formula,
$$rec_{uj} = \sum_{i \in I} sim(i, j) \cdot r_{ui}$$
where $rec_{uj}$ denotes the recommendation degree of second-domain item j for user u, I is the set of first-domain items involved in the user's first behavior data in the first domain, i is a first-domain item, sim(i, j) denotes the degree of correlation between first-domain item i and second-domain item j, and $r_{ui}$ denotes the preference of user u for first-domain item i.
[Computing device]
According to the present disclosure, a computing device is also provided that can be used to execute the relationship mining method and the item recommendation method of the present disclosure.
Fig. 8 is a schematic block diagram of a computing device that can be used to execute the method for mining relationships between items of different domains and the item recommendation method of the present disclosure.
As shown in Fig. 8, the computing device 500 may include a processor 510 and a memory 530. Executable code is stored on the memory 530. When the processor 510 executes the executable code, the processor 510 is caused to execute the relationship mining method and the item recommendation method described above.
The relationship mining and recommendation methods, devices, and computing device for items according to the present invention have been described in detail above with reference to the accompanying drawings.
In addition, the method according to the present invention may also be implemented as a computer program or computer program product, which includes computer program code instructions for executing the steps defined in the above method of the present invention.
Alternatively, the present invention may also be implemented as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) on which executable code (or a computer program, or computer instruction code) is stored. When the executable code (or computer program, or computer instruction code) is executed by a processor of an electronic device (or computing device, server, etc.), the processor is caused to execute the steps of the above method according to the present invention.
Those skilled in the art will also understand that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or a combination of both.
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
Embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical application, or their improvement over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.